TECHNICAL FIELD
This invention relates to nucleic acid and amino acid sequences of proteases and to the use of
these sequences in hydrolysis of peptide bonds and in the diagnosis, treatment, and prevention of
gastrointestinal, cardiovascular, autoimmune/inflammatory, cell proliferative, developmental, epithelial,
neurological, and reproductive disorders, and in the assessment of the effects of exogenous compounds
on the expression of nucleic acid and amino acid sequences of proteases.

BACKGROUND OF THE INVENTION 
Proteases cleave proteins and peptides at the peptide bond that forms the backbone of the
protein or peptide chain. Proteolysis is one of the most important and frequent enzymatic reactions that
occurs both within and outside of cells. Proteolysis is responsible for the activation and maturation of
nascent polypeptides, the degradation of misfolded and damaged proteins, and the controlled turnover of
peptides within the cell. Proteases participate in digestion, endocrine function, and tissue remodeling
during embryonic development, wound healing, and normal growth. Proteases can play a role in
regulatory processes by affecting the half life of regulatory proteins. Proteases are involved in the
etiology or progression of disease states such as inflammation, angiogenesis, tumor dispersion and
metastasis, cardiovascular disease, neurological disease, and bacterial, parasitic, and viral infections.

Proteases can be categorized on the basis of where they cleave their substrates. Exopeptidases,
which include aminopeptidases, dipeptidyl peptidases, tripeptidases, carboxypeptidases,
peptidyl-dipeptidases, dipeptidases, and omega peptidases, cleave residues at the termini of their
substrates. Endopeptidases, including serine proteases, cysteine proteases, and metalloproteases, cleave
at residues within the peptide. Four principal categories of mammalian proteases have been identified
based on active site structure, mechanism of action, and overall three-dimensional structure. (See
Beynon, R.J. and J.S. Bond (1994) Proteolytic Enzymes: A Practical Approach, Oxford University
Press, New York NY, pp. 1-5.)
Serine Proteases 

The serine proteases (SPs) are a large, widespread family of proteolytic enzymes that include
the digestive enzymes trypsin and chymotrypsin, components of the complement and blood-clotting
cascades, and enzymes that control the degradation and turnover of macromolecules within the cell and
in the extracellular matrix. Most of the more than 20 subfamilies can be grouped into six clans, each
with a common ancestor. These six clans are hypothesized to have descended from at least four
evolutionarily distinct ancestors. SPs are named for the presence of a serine residue found in the active
catalytic site of most families. The active site is defined by the catalytic triad, a set of conserved
aspartate, histidine, and serine residues critical for catalysis. These residues form a charge relay
network that facilitates substrate binding. Other residues outside the active site form an oxyanion hole
that stabilizes the tetrahedral transition intermediate formed during catalysis. SPs have a wide range of
substrates and can be subdivided into subfamilies on the basis of their substrate specificity. The main
subfamilies are named for the residue(s) after which they cleave: trypases (after arginine or lysine),
aspases (after aspartate), chymases (after phenylalanine or leucine), metases (after methionine), and
serases (after serine) (Rawlings, N.D. and A.J. Barrett (1994) Methods Enzymol. 244:19-61).
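
The subfamily naming rule above can be illustrated with a minimal Python sketch. It assumes a deliberately simplified model in which a protease cuts after every occurrence of its named residue(s); real enzymes have additional sequence and structural preferences that are not modeled here. The names SUBFAMILY_SPECIFICITY, cleavage_positions, and digest are hypothetical and are not part of the specification.

```python
# Simplified model (an assumption for illustration): each subfamily cuts the
# peptide bond immediately after every occurrence of its named residue(s).
# One-letter amino acid codes: R = Arg, K = Lys, D = Asp, F = Phe, L = Leu,
# M = Met, S = Ser.
SUBFAMILY_SPECIFICITY = {
    "trypase": {"R", "K"},
    "aspase": {"D"},
    "chymase": {"F", "L"},
    "metase": {"M"},
    "serase": {"S"},
}


def cleavage_positions(peptide, subfamily):
    """Return the indices at which the peptide would be cut.

    A position p means the bond between peptide[p - 1] and peptide[p].
    """
    residues = SUBFAMILY_SPECIFICITY[subfamily]
    # A cut after the final residue would release nothing, so it is skipped.
    return [i + 1 for i, aa in enumerate(peptide[:-1]) if aa in residues]


def digest(peptide, subfamily):
    """Split the peptide into the fragments produced by a complete digest."""
    bounds = [0] + cleavage_positions(peptide, subfamily) + [len(peptide)]
    return [peptide[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    print(digest("AKGRLDM", "trypase"))  # ['AK', 'GR', 'LDM']
    print(digest("AKGRLDM", "chymase"))  # ['AKGRL', 'DM']
```

The lookup table keeps the specificity rule separate from the digestion logic, so a different rule set could be substituted without touching the functions.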

Most mammalian serine proteases are synthesized as zymogens, inactive precursors that are
activated by proteolysis. For example, trypsinogen is converted to its active form, trypsin, by
enteropeptidase. Enteropeptidase is an intestinal protease that removes an N-terminal fragment from
trypsinogen. The remaining active fragment is trypsin, which in turn activates the precursors of the
other pancreatic enzymes. Likewise, proteolysis of prothrombin, the precursor of thrombin, generates
three separate polypeptide fragments. The N-terminal fragment is released while the other two
fragments, which comprise active thrombin, remain associated through disulfide bonds.

The two largest SP subfamilies are the chymotrypsin (S1) and subtilisin (S8) families. Some
members of the chymotrypsin family contain two structural domains unique to this family. Kringle
domains are triple-looped, disulfide cross-linked domains found in varying copy number. Kringles are
thought to play a role in binding mediators such as membranes, other proteins or phospholipids, and in
the regulation of proteolytic activity (PROSITE PDOC00020). Apple domains are 90 amino-acid
repeated domains, each containing six conserved cysteines. Three disulfide bonds link the first and
sixth, second and fifth, and third and fourth cysteines (PROSITE PDO[...]). Apple domains are
involved in protein-protein interactions. S1 family members include trypsin, chymotrypsin, coagulation
factors IX-XII, complement factors B, C, and D, granzymes, kallikrein, and tissue- and
urokinase-plasminogen activators. The subtilisin family has members found in the eubacteria,
archaebacteria, eukaryotes, and viruses. Subtilisins include the proprotein-processing endopeptidases
kexin and furin and the pituitary prohormone convertases PC1, PC2, PC3, PC6, and PACE4 (Rawlings
and Barrett, supra). The prolyl oligopeptidase (S9) family includes enzymes from prokaryotes and
eukaryotes with greatly differing specificities. Dipeptidyl peptidase IV (DPP-IV) is identical to CD26
and is implicated in the inactivation of peptide hormones, as well as in regulating T-cell growth
(reviewed in Kahne, T. et al. (1999) Int. J. Mol. Med. 4:3-15; Mentlein, R. (1999) Regul. Pept.
85:9-24). Inhibition of DPP-IV has been suggested as a treatment for type 2 diabetes (Holst, J.J. and
C.F. Deacon (1998) Diabetes 47:1663-1670), and lowered serum DPP-IV activity has been measured in
anorexia and bulimia patients (van West, D. et al. (2000) Eur. Arch. Psychiatry Clin. Neurosci.
250:86-92).

SPs have functions in many normal processes and some have been implicated in the etiology or
treatment of disease. Enterokinase, the initiator of intestinal digestion, is found in the intestinal
border, where it cleaves the acidic propeptide from trypsinogen to yield active trypsin (Kitamoto, Y. et
al. (1994) Proc. Natl. Acad. Sci. USA 91:7588-7592). Prolylcarboxypeptidase, a lysosomal serine
peptidase that cleaves peptides such as angiotensin II and III and [des-Arg9] bradykinin, shares
sequence homology with members of both the serine carboxypeptidase and prolylendopeptidase families
(Tan, F. et al. (1993) J. Biol. Chem. 268:16631-16638). The protease neuropsin may influence
synapse formation and neuronal connectivity in the hippocampus in response to neural signaling (Chen,
Z.-L. et al. (1995) J. Neurosci. 15:5088-5097). Tissue plasminogen activator is useful for acute
management of stroke (Zivin, J.A. (1999) Neurology 53:14-19) and myocardial infarction (Ross, A.M.
(1999) Clin. Cardiol. 22:165-171). Some receptors (PAR, for proteinase-activated receptor), highly
expressed throughout the digestive tract, are activated by proteolytic cleavage of an extracellular
domain. The major agonists for PARs, thrombin, trypsin, and mast cell tryptase, are [...]
and inflammatory conditions. Control of PAR activation by proteases has been suggested as a
promising therapeutic target (Vergnolle, N. (2000) Aliment. Pharmacol. Ther. 14:257-266; Rice, K.D.
et al. (1998) Curr. Pharm. Des. 4:381-396). Prostate-specific antigen (PSA) is a kallikrein-like serine
protease synthesized and secreted exclusively by epithelial cells in the prostate gland. Serum PSA is
elevated in prostate cancer and is the most sensitive physiological marker for monitoring cancer
progression and response to therapy. PSA can also identify the prostate as the origin of a metastatic
tumor (Brawer, M.K. and P.H. Lange (1989) Urology 33:11-16).

The signal peptidase is a specialized class of SP found in all prokaryotic and eukaryotic cell
types that serves in the processing of signal peptides from certain proteins. Signal peptides are
amino-terminal domains of a protein which direct the protein from its ribosomal assembly site to a
particular cellular or extracellular location. Once the protein has been exported, removal of the signal
sequence by a signal peptidase and posttranslational processing, e.g., glycosylation or phosphorylation,
activate the protein. Signal peptidases exist as multi-subunit complexes in both yeast and mammals.
The canine signal peptidase complex is composed of five subunits, all associated with the microsomal
membrane and containing hydrophobic regions that span the membrane one or more times (Shelness,
G.S. and G. Blobel (1990) J. Biol. Chem. 265:9512-9519). Some of these subunits serve to fix the
complex in its proper position on the membrane while others contain the actual catalytic activity.

Another family of proteases which have a serine in their active site are dependent on the
hydrolysis of ATP for their activity. These proteases contain proteolytic core domains and regulatory
ATPase domains which can be identified by the presence of the P-loop, an ATP/GTP-binding motif
(PROSITE PDOC00803). Members of this family include the eukaryotic mitochondrial matrix
proteases, Clp protease and the proteasome. Clp protease was originally found in plant chloroplasts but
is believed to be widespread in both prokaryotic and eukaryotic cells. The gene for early-onset torsion
dystonia encodes a protein related to Clp protease (Ozelius, L.J. et al. (1998) Adv. Neurol. 78:93-105).

The proteasome is an intracellular protease complex found in some bacteria and in all
eukaryotic cells, and plays an important role in cellular physiology. Proteasomes are associated with
the ubiquitin conjugation system (UCS), a major pathway for the degradation of cellular proteins of all
types, including proteins that function to activate or repress cellular processes such as transcription and
cell cycle progression (Ciechanover, A. (1994) Cell 79:13-21). In the UCS pathway, proteins targeted
for degradation are conjugated to ubiquitin, a small heat stable protein. The ubiquitinated protein is
then recognized and degraded by the proteasome. The resultant ubiquitin-peptide complex is
hydrolyzed by a ubiquitin carboxyl terminal hydrolase, and free ubiquitin is released for reutilization by
the UCS. Ubiquitin-proteasome systems are implicated in the degradation of mitotic cyclic kinases,
oncoproteins, tumor suppressor genes (p53), cell surface receptors associated with signal transduction,
transcriptional regulators, and mutated or damaged proteins (Ciechanover, supra). This pathway has
been implicated in a number of diseases, including cystic fibrosis, Angelman's syndrome, and Liddle
syndrome (reviewed in Schwartz, A.L. and A. Ciechanover (1999) Annu. Rev. Med. 50:57-74). A
murine proto-oncogene, Unp, encodes a nuclear ubiquitin protease whose overexpression leads to
oncogenic transformation of NIH3T3 cells. The human homologue of this gene is consistently elevated
in small cell tumors and adenocarcinomas of the lung (Gray, D.A. (1995) Oncogene 10:2179-2183).
Ubiquitin carboxyl terminal hydrolase is involved in the differentiation of a lymphoblastic leukemia cell
line to a non-dividing mature state (Maki, A. et al. (1996) Differentiation 60:59-66). In neurons,
ubiquitin carboxyl terminal hydrolase (PGP 9.5) expression is strong in the abnormal structures that
occur in human neurodegenerative diseases (Lowe, J. et al. (1990) J. Pathol. 161:153-160). The
proteasome is a large (~2000 kDa) multisubunit complex composed of a central catalytic core
containing a variety of proteases arranged in four seven-membered rings with the active sites facing
inwards into the central cavity, and terminal ATPase subunits covering the outer port of the cavity and
regulating substrate entry (for review, see Schmidt, M. et al. (1999) Curr. Opin. Chem. Biol.
3:584-591).

Cysteine Proteases 

Cysteine proteases (CPs) are involved in diverse cellular processes ranging from the processing
of precursor proteins to intracellular degradation. Nearly half of the CPs known are present only in
viruses. CPs have a cysteine as the major catalytic residue at the active site where catalysis proceeds
via a thioester intermediate and is facilitated by nearby histidine and asparagine residues. A glutamine
residue is also important, as it helps to form an oxyanion hole. Two important CP families include the
papain-like enzymes (C1) and the calpains (C2). Papain-like family members are generally lysosomal
or secreted and therefore are synthesized with signal peptides as [...]. Most members
bear a conserved motif in the propeptide that may have structural significance (Karrer, K.M. et al.
(1993) Proc. Natl. Acad. Sci. USA 90:3063-3067). Three-dimensional structures of papain family
members show a bilobed molecule with the catalytic site located [...]. Papains
include cathepsins B, C, H, L, and S, certain plant allergens and dipeptidyl peptidase (for a review, see
Rawlings, N.D. and A.J. Barrett (1994) Methods Enzymol. 244:461-486).

Some CPs are expressed ubiquitously, while others are produced only by cells of the immune
system. Of particular note, CPs are produced by monocytes, macrophages and other cells which
migrate to sites of inflammation and secrete molecules involved in tissue repair. Overabundance of
these repair molecules plays a role in certain disorders. In autoimmune diseases such as rheumatoid
arthritis, secretion of the cysteine peptidase cathepsin C degrades collagen, laminin, elastin and other
structural proteins found in the extracellular matrix of bones. Bone weakened by such degradation is
also more susceptible to tumor invasion and metastasis. Cathepsin L expression may also contribute to
the influx of mononuclear cells which exacerbates the destruction of the rheumatoid synovium
(Keyszer, G.M. (1995) Arthritis Rheum. 38:976-984).

Calpains are calcium-dependent cytosolic endopeptidases which contain both an N-terminal
catalytic domain and a C-terminal calcium-binding domain. Calpain is expressed as a proenzyme
heterodimer consisting of a catalytic subunit unique to each isoform and a regulatory subunit common
to different isoforms. Each subunit bears a calcium-binding EF-hand domain. The regulatory subunit
also contains a hydrophobic glycine-rich domain that allows the enzyme to associate with cell
membranes. Calpains are activated by increased intracellular calcium concentration, which induces a
change in conformation and limited autolysis. The resultant active molecule requires a lower calcium
concentration for its activity (Chan, S.L. and M.P. Mattson (1999) J. Neurosci. Res. 58:167-190).
Calpain expression is predominantly neuronal, although it is present in other tissues. Several chronic
neurodegenerative disorders, including ALS, Parkinson's disease and Alzheimer's disease, are
associated with increased calpain expression (Chan and Mattson, supra). Calpain-mediated breakdown
of the cytoskeleton has been proposed to contribute to brain damage resulting from head injury
(McCracken, E. et al. (1999) J. Neurotrauma 16:749-761). Calpain-3 is predominantly expressed in
skeletal muscle, and is responsible for limb-girdle muscular dystrophy type 2A (Minami, N. et al.
(1999) J. Neurol. Sci. 171:31-37).

Another family of thiol proteases is the caspases, which are involved in the initiation and
execution phases of apoptosis. A pro-apoptotic signal can activate initiator caspases that trigger a
proteolytic caspase cascade, leading to the hydrolysis of target proteins and the classic apoptotic death
of the cell. Two active site residues, a cysteine and a histidine, have been implicated in the catalytic
mechanism. Caspases are among the most specific endopeptidases, cleaving after aspartate residues.
Caspases are synthesized as inactive zymogens consisting of one large (p20) and one small (p10)
subunit separated by a small spacer region, and a variable N-terminal prodomain. This prodomain
interacts with cofactors that can positively or negatively affect apoptosis. An activating signal causes
autoproteolytic cleavage of a specific aspartate residue (D297 in the caspase-1 numbering convention)
[...] caspase family members have been shown to promote dimerization and auto-processing of
procaspases. [...] receptor complex. In addition, two dimers from different caspase family members can
associate, changing the substrate specificity of the resultant tetramer. Endogenous caspase inhibitors
(inhibitor of apoptosis proteins, or IAPs) also exist. All these interactions have clear effects on the
control of apoptosis (reviewed in Chan and Mattson, supra; Salveson, G.S. and V.M. Dixit (1999)
Proc. Natl. Acad. Sci. USA 96:10964-10967).

Caspases have been implicated in a number of diseases. Mice lacking some caspases have
severe nervous system defects due to failed apoptosis in the neuroepithelium and suffer early lethality.
Others show severe defects in the inflammatory response, as caspases are responsible for processing
IL-1b and possibly other inflammatory cytokines (Chan and Mattson, supra). Cowpox virus and
baculoviruses target caspases to avoid the death of their host cell and promote successful infection. In
addition, increases in inappropriate apoptosis have been reported in AIDS, neurodegenerative diseases
and ischemic injury, while a decrease in cell death is associated with cancer (Salveson and Dixit, supra;
Thompson, C.B. (1995) Science 267:1456-1462).

Aspartyl Proteases

Aspartyl proteases (APs) include the lysosomal proteases cathepsins D and E, as well as
chymosin, renin, and the gastric pepsins. Most retroviruses encode an AP, usually as part of the pol
polyprotein. APs, also called acid proteases, are monomeric enzymes consisting of two domains, each
domain containing one half of the active site with its own catalytic aspartic acid residue. APs are most
active in the range of pH 2-3, at which one of the aspartate residues is ionized and the other neutral.
The pepsin family of APs contains many secreted enzymes, and all are likely to be synthesized with
signal peptides and propeptides. Most family members have three disulfide loops, the first ~5 residue
loop following the first aspartate, the second 5-6 residue loop preceding the second aspartate, and the
third and largest loop occurring toward the C terminus. Retropepsins, on the other hand, are analogous
to a single domain of pepsin, and become active as homodimers with each retropepsin
contributing one half of the active site. Retropepsins are required for processing the viral polyproteins.

APs have roles in various tissues, and some have been associated with disease. Renin mediates
the first step in processing the hormone angiotensin, which is responsible for regulating electrolyte
balance and blood pressure (reviewed in Crews and S.R. Williams (1999) Hum. Biol. 71:475-503).
Abnormal regulation and expression of cathepsins are evident in various inflammatory disease
states. Expression of cathepsin D is elevated in synovial tissues from patients with rheumatoid arthritis
and osteoarthritis. The increased expression and differential regulation of the cathepsins are linked to
the metastatic potential of a variety of cancers (Chambers, A.F. et al. (1993) Crit. Rev. Oncol.
4:95-114).

Metalloproteases 

Metalloproteases require a metal ion for activity, usually manganese or zinc. Examples of
manganese metalloenzymes include aminopeptidase P and human proline dipeptidase (PEPD).
Aminopeptidase P can degrade bradykinin, a nonapeptide activated in a variety of inflammatory
responses. Aminopeptidase P has been implicated in coronary ischemia/reperfusion injury.
Administration of aminopeptidase P inhibitors has been shown to have a cardioprotective effect in rats
(Ersahin, C. et al. (1999) J. Cardiovasc. Pharmacol. 34:604-611).

Most zinc-dependent metalloproteases share a common sequence in the zinc-binding domain.
The active site is made up of two histidines which act as zinc ligands and a catalytic glutamic acid
C-terminal to the first histidine. Proteins containing this signature sequence are known as the metzincins
and include aminopeptidases B and N, angiotensin-converting enzyme, neurolysin, the matrix
metalloproteases and the adamalysins (ADAMs). An alternate sequence is found in the zinc
carboxypeptidases, in which all three conserved residues (two histidines and a glutamic acid) are
involved in zinc binding.

A number of the neutral metalloendopeptidases, including angiotensin converting enzyme and
the aminopeptidases, are involved in the metabolism of peptide hormones. High aminopeptidase B
activity, for example, is found in the adrenal glands and neurohypophyses of hypertensive rats (Prieto,
I. et al. (1998) Horm. Metab. Res. 30:246-248). Oligopeptidase M/neurolysin can hydrolyze
bradykinin as well as neurotensin (Serizawa, A. et al. (1995) J. Biol. Chem. 270:2092-2098).
Neurotensin is a vasoactive peptide that can act as a neurotransmitter in the brain, where it has been
implicated in limiting food intake (Tritos, N.A. et al. (1999) Neuropeptides 33:339-349).

The matrix metalloproteases (MMPs) are a family of at least 23 enzymes that can degrade
components of the extracellular matrix (ECM). They are Zn2+ endopeptidases with an N-terminal
catalytic domain. Nearly all members of the family have a hinge peptide and C-terminal domain which
can bind to substrate molecules in the ECM or to inhibitors produced by the tissue (TIMPs, for tissue
inhibitor of metalloprotease; Campbell, I.L. et al. (1999) Trends Neurosci. 22:285). The presence of
fibronectin-like repeats, transmembrane domains, or C-terminal hemopexinase-like domains can be used
to separate MMPs into collagenase, gelatinase, stromelysin and membrane-type MMP subfamilies. In
the inactive form, the Zn2+ ion in the active site interacts with a cysteine [...]. Activating
factors disrupt the Zn2+-cysteine interaction, or "cysteine switch," exposing the active site. This
partially activates the enzyme, which then cleaves off its propeptide and becomes fully active. MMPs
are often activated by the serine proteases plasmin and furin. MMPs are often regulated by
stoichiometric, noncovalent interactions with inhibitors; the balance of protease to inhibitor, then, is
very important in tissue homeostasis (reviewed in Yong, V.W. et al. (1998) Trends Neurosci. 21:75).
Ehlers-Danlos syndrome type VII C is caused by mutations in the procollagen I N-proteinase gene
(Colige, A. et al. (1999) Am. J. Hum. Genet. 65:308-317).

MMPs are implicated in a number of diseases including osteoarthritis (Mitchell, P. et al. (1996)
J. Clin. Invest. 97:761), atherosclerotic plaque rupture (Sukhova, G.K. et al. (1999) Circulation
99:2503), aortic aneurysm (Schneiderman, J. et al. (1998) Am. J. Path. 152:703), non-healing wounds
(Saarialho-Kere, U.K. et al. (1994) J. Clin. Invest. 94:79), bone resorption (Blavier, L. and J.M.
Delaisse (1995) J. Cell Sci. 108:3649), age-related macular degeneration (Steen, B. et al. (1998) Invest.
Ophthalmol. Vis. Sci. 39:2194), emphysema (Finlay, G.A. et al. (1997) Thorax 52:502), myocardial
infarction (Rohde, L.E. et al. (1999) Circulation 99:3063) and dilated cardiomyopathy (Thomas, C.V.
et al. (1998) Circulation 97:1708). MMP inhibitors prevent metastasis of mammary carcinoma and
experimental tumors in rat, and Lewis lung carcinoma, hemangioma, and human ovarian carcinoma
xenografts in mice (Eccles, S.A. et al. (1996) Cancer Res. 56:2815; Anderson et al. (1996) Cancer Res.
56:715-718; Volpert, O.V. et al. (1996) J. Clin. Invest. 98:671; Taraboletti, G. et al. (1995) J. NCI
87:293; Davies, B. et al. (1993) Cancer Res. 53:2087). MMPs may be active in Alzheimer's disease.
A number of MMPs are implicated in multiple sclerosis, and administration of MMP inhibitors can
relieve some of its symptoms (reviewed in Yong, supra).

Another family of metalloproteases is the ADAMs, for A Disintegrin and Metalloprotease
Domain, which they share with their close relatives the adamalysins, snake venom metalloproteases
(SVMPs). ADAMs combine features of both cell surface adhesion molecules and proteases, containing
a prodomain, a protease domain, a disintegrin domain, a cysteine rich domain, an epidermal growth
factor repeat, a transmembrane domain, and a cytoplasmic tail. The first three domains listed above are
also found in the SVMPs. The ADAMs possess four potential functions: proteolysis, adhesion,
signaling and fusion. The ADAMs share the metzincin zinc binding sequence and are inhibited by some
MMP antagonists such as TIMP-1.

ADAMs are implicated in such processes as sperm-egg binding and fusion, myoblast fusion,
and protein-ectodomain processing or shedding of cytokines, cytokine receptors, adhesion proteins and
other extracellular protein domains (Schlondorff, J. and C.P. Blobel (1999) J. Cell. Sci.
112:3603-3617). The Kuzbanian protein cleaves a substrate in the NOTCH pathway (possibly NOTCH
itself), activating the program for lateral inhibition in Drosophila neural development. Two ADAMs,
TACE (ADAM 17) and ADAM 10, are proposed to have analogous roles in the processing of amyloid
precursor protein in the brain (Schlondorff and Blobel, supra). TACE has also been identified as the
TNF activating enzyme (Black, R.A. et al. (1997) Nature 385:729). TNF is a pleiotropic cytokine that
is important in mobilizing host defenses in response to infection or trauma, but can cause severe
damage in excess and is often overproduced in autoimmune disease. TACE cleaves membrane-bound
pro-TNF to release a soluble form. Other ADAMs may be involved in a similar type of processing of
other membrane-bound molecules.

The ADAMTS sub-family has all of the features of ADAM family metalloproteases and
contains an additional thrombospondin domain (TS). The prototypic ADAMTS was identified in mouse,
found to be expressed in heart and kidney and upregulated by proinflammatory stimuli (Kuno, K. et al.
(1997) J. Biol. Chem. 272:556-562). To date eleven members are recognized by the Human Genome
Organization (HUGO; http://www.gene.ucl.ac.uk/users/hester/adamts.html#Approved). Members of
this family have the ability to degrade aggrecan, a high molecular weight proteoglycan which provides
cartilage with important mechanical properties including compressibility, and which is lost during the
development of arthritis. Enzymes which degrade aggrecan are thus considered attractive targets to
prevent and slow the degradation of articular cartilage (See, e.g., Tortorella, M.D. (1999) Science
284:1664; Abbaszade, I. (1999) J. Biol. Chem. 274:23443). Other members are reported to have
antiangiogenic potential (Kuno et al., supra) and/or procollagen processing (Colige, A. et al. (1997)
Proc. Natl. Acad. Sci. USA 94:2374).

The discovery of new proteases and the polynucleotides encoding them satisfies a need in the
art by providing new compositions which are useful in hydrolysis of peptide bonds and in the diagnosis,
prevention, and treatment of gastrointestinal, cardiovascular, autoimmune/inflammatory, cell
proliferative, developmental, epithelial, neurological and reproductive disorders, and in the assessment
of the effects of exogenous compounds on the expression of nucleic acid and amino acid sequences of
proteases.



SUMMARY OF THE INVENTION 

The invention features purified polypeptides, proteases, referred to collectively as "PRTS" and
individually as "PRTS-1," "PRTS-2," "PRTS-3," "PRTS-4," "PRTS-5," "PRTS-6," "PRTS-7,"
"PRTS-8," "PRTS-9," "PRTS-10," "PRTS-11," "PRTS-12," "PRTS-13," "PRTS-14," "PRTS-15,"
"PRTS-16," "PRTS-17," "PRTS-18," "PRTS-19," "PRTS-20," and "PRTS-21." In one aspect, the
invention provides an isolated polypeptide selected from the group consisting of a) a polypeptide
comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-21, b) a
polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino
acid sequence selected from the group consisting of SEQ ID NO:1-21, c) a biologically active fragment
of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-21,
and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:1-21. In one alternative, the invention provides an isolated polypeptide
comprising the amino acid sequence of SEQ ID NO:1-21.
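
The "at least 90% identical" criterion in item b) can be illustrated with a minimal Python sketch. It assumes that the two sequences have already been aligned to equal length by some external tool, with "-" marking gaps, and that identity is counted over aligned columns; this section of the specification does not state which alignment method or identity definition applies, so both are assumptions made for illustration. The function names percent_identity and meets_threshold are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumes the inputs are already aligned (equal length, '-' for gaps).
    Columns where both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned positions to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a, aligned_b, threshold=90.0):
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy 10-residue sequences, not taken from the sequence listing.
    print(percent_identity("MKTAYIAKQR", "MKTAYIAKQK"))  # 90.0
    print(meets_threshold("MKTAYIAKQR", "MKTAYLAKQK"))   # False
```

Other conventions, for example dividing by the length of the shorter sequence, would give different values for gapped alignments, so the denominator is a design choice that should be stated alongside any reported percentage.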

The invention further provides an isolated polynucleotide encoding a polypeptide selected from
the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group
consisting of SEQ ID NO:1-21, b) a polypeptide comprising a naturally occurring amino acid sequence
at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-21,
c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:1-21, and d) an immunogenic fragment of a polypeptide having an
amino acid sequence selected from the group consisting of SEQ ID NO:1-21. In one alternative, the
polynucleotide encodes a polypeptide selected from the group consisting of SEQ ID NO:1-21. In
another alternative, the polynucleotide is selected from the group consisting of SEQ ID NO:22-42.

Additionally, the invention provides a recombinant polynucleotide comprising a promoter
sequence operably linked to a polynucleotide encoding a polypeptide selected from the group consisting
of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID
NO:1-21, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical
to an amino acid sequence selected from the group consisting of SEQ ID NO:1-21, c) a biologically
active fragment of a polypeptide having an amino acid sequence selected from the group consisting of
SEQ ID NO:1-21, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-21. In one alternative, the invention provides a cell
transformed with the recombinant polynucleotide. In another alternative, the invention provides a
transgenic organism comprising the recombinant polynucleotide.

The invention also provides a method for producing a polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of
SEQ ID NO:1-21, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90%
identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-21, c) a
biologically active fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-21, and d) an immunogenic fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-21. The method comprises a)
culturing a cell under conditions suitable for expression of the polypeptide, wherein said cell is
transformed with a recombinant polynucleotide comprising a promoter sequence operably linked to a
polynucleotide encoding the polypeptide, and b) recovering the polypeptide so expressed.

Additionally, the invention provides an isolated antibody which specifically binds to a
polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence
selected from the group consisting of SEQ ID NO:1-21, b) a polypeptide comprising a naturally
occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group
consisting of SEQ ID NO:1-21, c) a biologically active fragment of a polypeptide having an amino acid
sequence selected from the group consisting of SEQ ID NO:1-21, and d) an immunogenic fragment of a
polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-21.

The invention further provides an isolated polynucleotide selected from the group consisting of
a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID
NO:22-42, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90%
identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:22-42, c) a
polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to the
polynucleotide of b), and e) an RNA equivalent of a)-d). In one alternative, the polynucleotide
comprises at least 60 contiguous nucleotides.

Additionally, the invention provides a method for detecting a target polynucleotide in a sample,
said target polynucleotide having a sequence of a polynucleotide selected from the group consisting of
a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID
NO:22-42, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90%
identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:22-42, c) a
polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to the
polynucleotide of b), and e) an RNA equivalent of a)-d). The method comprises a) hybridizing the
sample with a probe comprising at least 20 contiguous nucleotides comprising a sequence
complementary to said target polynucleotide in the sample, and which probe specifically hybridizes to
said target polynucleotide, under conditions whereby a hybridization complex is formed between said
probe and said target polynucleotide or fragments thereof, and b) detecting the presence or absence of
said hybridization complex, and optionally, if present, the amount thereof. In one alternative, the probe
comprises at least 60 contiguous nucleotides.

The invention further provides a method for detecting a target polynucleotide in a sample, said
target polynucleotide having a sequence of a polynucleotide selected from the group consisting of a) a
polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID
NO:22-42, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90%
identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:22-42, c) a
polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to the
polynucleotide of b), and e) an RNA equivalent of a)-d). The method comprises a) amplifying said
target polynucleotide or fragment thereof using polymerase chain reaction amplification, and b)
detecting the presence or absence of said amplified target polynucleotide or fragment thereof, and,
optionally, if present, the amount thereof.

The invention further provides a composition comprising an effective amount of a polypeptide
selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from
the group consisting of SEQ ID NO:1-21, b) a polypeptide comprising a naturally occurring amino acid
sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ
ID NO:1-21, c) a biologically active fragment of a polypeptide having an amino acid sequence selected
from the group consisting of SEQ ID NO:1-21, and d) an immunogenic fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-21, and a
pharmaceutically acceptable excipient. In one embodiment, the composition comprises an amino acid
sequence selected from the group consisting of SEQ ID NO:1-21. The invention additionally provides a
method of treating a disease or condition associated with decreased expression of functional PRTS,
comprising administering to a patient in need of such treatment the composition.

The invention also provides a method for screening a compound for effectiveness as an
agonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:1-21, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from
the group consisting of SEQ ID NO:1-21, c) a biologically active fragment of a polypeptide having an
amino acid sequence selected from the group consisting of SEQ ID NO:1-21, and d) an immunogenic
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID
NO:1-21. The method comprises a) exposing a sample comprising the polypeptide to a compound,
and b) detecting agonist activity in the sample. In one alternative, the invention provides a
composition comprising an agonist compound identified by the method and a pharmaceutically
acceptable excipient. In another alternative, the invention provides a method of treating a disease or
condition associated with decreased expression of functional PRTS, comprising administering to a
patient in need of such treatment the composition.

Additionally, the invention provides a method for screening a compound for effectiveness as
an antagonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an
amino acid sequence selected from the group consisting of SEQ ID NO:1-21, b) a polypeptide
comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence
selected from the group consisting of SEQ ID NO:1-21, c) a biologically active fragment of a
polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-21, and
d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-21. The method comprises a) exposing a sample comprising the
polypeptide to a compound, and b) detecting antagonist activity in the sample. In one alternative, the
invention provides a composition comprising an antagonist compound identified by the method and a
pharmaceutically acceptable excipient. In another alternative, the invention provides a method of
treating a disease or condition associated with overexpression of functional PRTS, comprising
administering to a patient in need of such treatment the composition.

The invention further provides a method of screening for a compound that specifically binds
to a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-21, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from
the group consisting of SEQ ID NO:1-21, c) a biologically active fragment of a polypeptide having an
amino acid sequence selected from the group consisting of SEQ ID NO:1-21, and d) an immunogenic
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID
NO:1-21. The method comprises a) combining the polypeptide with at least one test compound under
suitable conditions, and b) detecting binding of the polypeptide to the test compound, thereby
identifying a compound that specifically binds to the polypeptide.

The invention further provides a method of screening for a compound that modulates the
activity of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:1-21, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from
the group consisting of SEQ ID NO:1-21, c) a biologically active fragment of a polypeptide having an
amino acid sequence selected from the group consisting of SEQ ID NO:1-21, and d) an immunogenic
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID
NO:1-21. The method comprises a) combining the polypeptide with at least one test compound under
conditions permissive for the activity of the polypeptide, b) assessing the activity of the polypeptide
in the presence of the test compound, and c) comparing the activity of the polypeptide in the presence
of the test compound with the activity of the polypeptide in the absence of the test compound,
wherein a change in the activity of the polypeptide in the presence of the test compound is indicative
of a compound that modulates the activity of the polypeptide.

The invention further provides a method for screening a compound for effectiveness in
altering expression of a target polynucleotide, wherein said target polynucleotide comprises a
sequence selected from the group consisting of SEQ ID NO:22-42, the method comprising a)
exposing a sample comprising the target polynucleotide to a compound, and b) detecting altered
expression of the target polynucleotide.

The invention further provides a method for assessing toxicity of a test compound, said
method comprising a) treating a biological sample containing nucleic acids with the test compound;
b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20
contiguous nucleotides of a polynucleotide selected from the group consisting of i) a polynucleotide
comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:22-42, ii) a
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a
polynucleotide sequence selected from the group consisting of SEQ ID NO:22-42, iii) a
polynucleotide having a sequence complementary to i), iv) a polynucleotide complementary to the
polynucleotide of ii), and v) an RNA equivalent of i)-iv). Hybridization occurs under conditions
whereby a specific hybridization complex is formed between said probe and a target polynucleotide in
the biological sample, said target polynucleotide selected from the group consisting of i) a
polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID
NO:22-42, ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90%
identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:22-42, iii) a
polynucleotide complementary to the polynucleotide of i), iv) a polynucleotide complementary to the
polynucleotide of ii), and v) an RNA equivalent of i)-iv). Alternatively, the target polynucleotide
comprises a fragment of a polynucleotide sequence selected from the group consisting of i)-v) above;
c) quantifying the amount of hybridization complex; and d) comparing the amount of hybridization
complex in the treated biological sample with the amount of hybridization complex in an untreated
biological sample, wherein a difference in the amount of hybridization complex in the treated
biological sample is indicative of toxicity of the test compound.

BRIEF DESCRIPTION OF THE TABLES
Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide
sequences of the present invention.

Table 2 shows the GenBank identification number and annotation of the nearest GenBank
homolog for polypeptides of the invention. The probability score for the match between each
polypeptide and its GenBank homolog is also shown.

Table 3 shows structural features of polypeptide sequences of the invention, including predicted
motifs and domains, along with the methods, algorithms, and searchable databases used for analysis of
the polypeptides.

Table 4 lists the cDNA and/or genomic DNA fragments which were used to assemble
polynucleotide sequences of the invention, along with selected fragments of the polynucleotide
sequences.

Table 5 shows the representative cDNA library for polynucleotides of the invention.

Table 6 provides an appendix which describes the tissues and vectors used for construction of
the cDNA libraries shown in Table 5.

Table 7 shows the tools, programs, and algorithms used to analyze the polynucleotides and
polypeptides of the invention, along with applicable descriptions, references, and threshold parameters.



DESCRIPTION OF THE INVENTION

Before the present proteins, nucleotide sequences, and methods are described, it is understood
that this invention is not limited to the particular machines, materials and methods described, as these
may vary. It is also to be understood that the terminology used herein is for the purpose of describing
particular embodiments only, and is not intended to limit the scope of the present invention which will
be limited only by the appended claims.

It must be noted that as used herein and in the appended claims, the singular forms "a," "an,"
and "the" include plural reference unless the context clearly dictates otherwise. Thus, for example, a
reference to "a host cell" includes a plurality of such host cells, and a reference to "an antibody" is a
reference to one or more antibodies and equivalents thereof known to those skilled in the art, and so
forth.

Unless defined otherwise, all technical and scientific terms used herein have the same meanings
as commonly understood by one of ordinary skill in the art to which this invention belongs. Although
any machines, materials, and methods similar or equivalent to those described herein can be used to
practice or test the present invention, the preferred machines, materials and methods are now described.
All publications mentioned herein are cited for the purpose of describing and disclosing the cell lines,
protocols, reagents and vectors which are reported in the publications and which might be used in
connection with the invention. Nothing herein is to be construed as an admission that the invention is
not entitled to antedate such disclosure by virtue of prior invention.

DEFINITIONS

"PRTS" refers to the amino acid sequences of substantially purified PRTS obtained from any
species, particularly a mammalian species, including bovine, ovine, porcine, murine, equine, and
human, and from any source, whether natural, synthetic, semi-synthetic, or recombinant.

The term "agonist" refers to a molecule which intensifies or mimics the biological activity of
PRTS. Agonists may include proteins, nucleic acids, carbohydrates, small molecules, or any other
compound or composition which modulates the activity of PRTS either by directly interacting with
PRTS or by acting on components of the biological pathway in which PRTS participates.

An "allelic variant" is an alternative form of the gene encoding PRTS. Allelic variants may
result from at least one mutation in the nucleic acid sequence and may result in altered mRNAs or in
polypeptides whose structure or function may or may not be altered. A gene may have none, one, or
many allelic variants of its naturally occurring form. Common mutational changes which give rise to
allelic variants are generally ascribed to natural deletions, additions, or substitutions of nucleotides.
Each of these types of changes may occur alone, or in combination with the others, one or more times in
a given sequence.

"Altered" nucleic acid sequences encoding PRTS include those sequences with deletions,
insertions, or substitutions of different nucleotides, resulting in a polypeptide the same as PRTS or a
polypeptide with at least one functional characteristic of PRTS. Included within this definition are
polymorphisms which may or may not be readily detectable using a particular oligonucleotide probe of
the polynucleotide encoding PRTS, and improper or unexpected hybridization to allelic variants, with a
locus other than the normal chromosomal locus for the polynucleotide sequence encoding PRTS. The
encoded protein may also be "altered," and may contain deletions, insertions, or substitutions of amino
acid residues which produce a silent change and result in a functionally equivalent PRTS. Deliberate
amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility,
hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues, as long as the biological
or immunological activity of PRTS is retained. For example, negatively charged amino acids may
include aspartic acid and glutamic acid, and positively charged amino acids may include lysine and
arginine. Amino acids with uncharged polar side chains having similar hydrophilicity values may
include: asparagine and glutamine; and serine and threonine. Amino acids with uncharged side chains
having similar hydrophilicity values may include: leucine, isoleucine, and valine; glycine and alanine;
and phenylalanine and tyrosine.

The terms "amino acid" and "amino acid sequence" refer to an oligopeptide, peptide,
polypeptide, or protein sequence, or a fragment of any of these, and to naturally occurring or synthetic
molecules. Where "amino acid sequence" is recited to refer to a sequence of a naturally occurring
protein molecule, "amino acid sequence" and like terms are not meant to limit the amino acid sequence
to the complete native amino acid sequence associated with the recited protein molecule.

"Amplification" relates to the production of additional copies of a nucleic acid sequence.
Amplification is generally carried out using polymerase chain reaction (PCR) technologies well known
in the art.

The term "antagonist" refers to a molecule which inhibits or attenuates the biological activity of
PRTS. Antagonists may include proteins such as antibodies, nucleic acids, carbohydrates, small
molecules, or any other compound or composition which modulates the activity of PRTS either by
directly interacting with PRTS or by acting on components of the biological pathway in which PRTS
participates.

The term "antibody" refers to intact immunoglobulin molecules as well as to fragments thereof,
such as Fab, F(ab')2, and Fv fragments, which are capable of binding an epitopic determinant.
Antibodies that bind PRTS polypeptides can be prepared using intact polypeptides or using fragments
containing small peptides of interest as the immunizing antigen. The polypeptide or oligopeptide used
to immunize an animal (e.g., a mouse, a rat, or a rabbit) can be derived from the translation of RNA, or
synthesized chemically, and can be conjugated to a carrier protein if desired. Commonly used carriers
that are chemically coupled to peptides include bovine serum albumin, thyroglobulin, and keyhole
limpet hemocyanin (KLH). The coupled peptide is then used to immunize the animal.

The term "antigenic determinant" refers to that region of a molecule (i.e., an epitope) that
makes contact with a particular antibody. When a protein or a fragment of a protein is used to
immunize a host animal, numerous regions of the protein may induce the production of antibodies which
bind specifically to antigenic determinants (particular regions or three-dimensional structures on the
protein). An antigenic determinant may compete with the intact antigen (i.e., the immunogen used to
elicit the immune response) for binding to an antibody.

The term "antisense" refers to any composition capable of base-pairing with the "sense"
(coding) strand of a specific nucleic acid sequence. Antisense compositions may include DNA; RNA;
peptide nucleic acid (PNA); oligonucleotides having modified backbone linkages such as
phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides having modified
sugar groups such as 2'-methoxyethyl sugars or 2'-methoxyethoxy sugars; or oligonucleotides having
modified bases such as 5-methyl cytosine, 2'-deoxyuracil, or 7-deaza-2'-deoxyguanosine. Antisense
molecules may be produced by any method including chemical synthesis or transcription. Once
introduced into a cell, the complementary antisense molecule base-pairs with a naturally occurring
nucleic acid sequence produced by the cell to form duplexes which block either transcription or
translation. The designation "negative" or "minus" can refer to the antisense strand, and the
designation "positive" or "plus" can refer to the sense strand of a reference DNA molecule.

The term "biologically active" refers to a protein having structural, regulatory, or biochemical
functions of a naturally occurring molecule. Likewise, "immunologically active" or "immunogenic"
refers to the capability of the natural, recombinant, or synthetic PRTS, or of any oligopeptide thereof,
to induce a specific immune response in appropriate animals or cells and to bind with specific
antibodies.

"Complementary" describes the relationship between two single-stranded nucleic acid
sequences that anneal by base-pairing. For example, 5'-AGT-3' pairs with its complement,
3'-TCA-5'.
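
The pairing rule in this definition can be illustrated with a short Python sketch. The function names
complement and are_complementary are hypothetical, introduced only for illustration, and the sketch
assumes unmodified DNA bases with standard Watson-Crick pairing (A with T, G with C).

```python
# Watson-Crick pairing for DNA bases (assumption: unmodified A, C, G, T only).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand_5_to_3: str) -> str:
    """Return the complementary strand, read position by position (i.e. 3'->5')."""
    return "".join(PAIRS[base] for base in strand_5_to_3.upper())


def are_complementary(strand_5_to_3: str, other_3_to_5: str) -> bool:
    """True if every base of the first strand pairs with the base opposite it."""
    return complement(strand_5_to_3) == other_3_to_5.upper()


print(complement("AGT"))                # TCA, i.e. 3'-TCA-5'
print(are_complementary("AGT", "TCA"))  # True
```

Written in the conventional 5'->3' direction, the same complement would read 5'-ACT-3'.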

A "composition comprising a given polynucleotide sequence" and a "composition comprising a
given amino acid sequence" refer broadly to any composition containing the given polynucleotide or
amino acid sequence. The composition may comprise a dry formulation or an aqueous solution.
Compositions comprising polynucleotide sequences encoding PRTS or fragments of PRTS may be
employed as hybridization probes. The probes may be stored in freeze-dried form and may be
associated with a stabilizing agent such as a carbohydrate. In hybridizations, the probe may be
deployed in an aqueous solution containing salts (e.g., NaCl), detergents (e.g., sodium dodecyl sulfate;
SDS), and other components (e.g., Denhardt's solution, dry milk, salmon sperm DNA, etc.).

"Consensus sequence" refers to a nucleic acid sequence which has been subjected to repeated
DNA sequence analysis to resolve uncalled bases, extended using the XL-PCR kit (Applied Biosystems,
Foster City CA) in the 5' and/or the 3' direction, and resequenced, or which has been assembled from
one or more overlapping cDNA, EST, or genomic DNA fragments using a computer program for
fragment assembly, such as the GELVIEW fragment assembly system (GCG, Madison WI) or Phrap
(University of Washington, Seattle WA). Some sequences have been both extended and assembled to
produce the consensus sequence.

"Conservative amino acid substitutions" are those substitutions that are predicted to least
interfere with the properties of the original protein, i.e., the structure and especially the function of the
protein is conserved and not significantly changed by such substitutions. The table below shows amino
acids which may be substituted for an original amino acid in a protein and which are regarded as
conservative amino acid substitutions.



Original Residue        Conservative Substitution
Ala                     Gly, Ser
Arg                     His, Lys
Asn                     Asp, Gln, His
Asp                     Asn, Glu
Cys                     Ala, Ser
Gln                     Asn, Glu, His
Glu                     Asp, Gln, His
Gly                     Ala
His                     Asn, Arg, Gln, Glu
Ile                     Leu, Val
Leu                     Ile, Val
Lys                     Arg, Gln, Glu
Met                     Leu, Ile
Phe                     His, Met, Leu, Trp, Tyr
Ser                     Cys, Thr
Thr                     Ser, Val
Trp                     Phe, Tyr
Tyr                     His, Phe, Trp
Val                     Ile, Leu, Thr


Conservative amino acid substitutions generally maintain (a) the structure of the polypeptide
backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation,
(b) the charge or hydrophobicity of the molecule at the site of the substitution, and/or (c) the bulk of the
side chain.
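
The substitution table above lends itself to a simple lookup. The following Python sketch transcribes
the table into a dictionary keyed by three-letter residue codes; the names CONSERVATIVE_SUBSTITUTIONS
and is_conservative are hypothetical, chosen for illustration, and are not part of the disclosure.

```python
# Conservative substitutions, transcribed from the table above
# (original residue -> residues regarded as conservative replacements).
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Gly", "Ser"},
    "Arg": {"His", "Lys"},
    "Asn": {"Asp", "Gln", "His"},
    "Asp": {"Asn", "Glu"},
    "Cys": {"Ala", "Ser"},
    "Gln": {"Asn", "Glu", "His"},
    "Glu": {"Asp", "Gln", "His"},
    "Gly": {"Ala"},
    "His": {"Asn", "Arg", "Gln", "Glu"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"His", "Met", "Leu", "Trp", "Tyr"},
    "Ser": {"Cys", "Thr"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Phe", "Tyr"},
    "Tyr": {"His", "Phe", "Trp"},
    "Val": {"Ile", "Leu", "Thr"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if the table lists `replacement` as conservative for `original`."""
    return replacement in CONSERVATIVE_SUBSTITUTIONS.get(original, set())


print(is_conservative("Leu", "Ile"))  # True
print(is_conservative("Phe", "His"))  # True
print(is_conservative("His", "Phe"))  # False: the table is directional
```

As transcribed, the lookup is directional: Phe may be replaced by His, but His is not listed as
replaceable by Phe, so the original residue must always be passed first.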

A "deletion" refers to a change in the amino acid or nucleotide sequence that results in the
absence of one or more amino acid residues or nucleotides.

The term "derivative" refers to a chemically modified polynucleotide or polypeptide. Chemical
modifications of a polynucleotide can include, for example, replacement of hydrogen by an alkyl, acyl,
hydroxyl, or amino group. A derivative polynucleotide encodes a polypeptide which retains at least one
biological or immunological function of the natural molecule. A derivative polypeptide is one modified
by glycosylation, pegylation, or any similar process that retains at least one biological or immunological
function of the polypeptide from which it was derived.

A "detectable label" refers to a reporter molecule or enzyme that is capable of generating a
measurable signal and is covalently or noncovalently joined to a polynucleotide or polypeptide.

"Differential expression" refers to increased or upregulated; or decreased, downregulated, or
absent gene or protein expression, determined by comparing at least two different samples. Such
comparisons may be carried out between, for example, a treated and an untreated sample, or a diseased
and a normal sample.

A "fragment" is a unique portion of PRTS or the polynucleotide encoding PRTS which is
identical in sequence to but shorter in length than the parent sequence. A fragment may comprise up
to the entire length of the defined sequence, minus one nucleotide/amino acid residue. For example, a
fragment may comprise from 5 to 1000 contiguous nucleotides or amino acid residues. A fragment
used as a probe, primer, antigen, therapeutic molecule, or for other purposes, may be at least 5, 10,
15, 16, 20, 25, 30, 40, 50, 60, 75, 100, 150, 250 or at least 500 contiguous nucleotides or amino acid
residues in length. Fragments may be preferentially selected from certain regions of a molecule. For
example, a polypeptide fragment may comprise a certain length of contiguous amino acids selected
from the first 250 or 500 amino acids (or first 25% or 50%) of a polypeptide as shown in a certain
defined sequence. Clearly these lengths are exemplary, and any length that is supported by the
specification, including the Sequence Listing, tables, and figures, may be encompassed by the present
embodiments.

WO 01/98468 PCT/US01/19178

A fragment of SEQ ID NO:22-42 comprises a region of unique polynucleotide sequence that
specifically identifies SEQ ID NO:22-42, for example, as distinct from any other sequence in the
genome from which the fragment was obtained. A fragment of SEQ ID NO:22-42 is useful, for
example, in hybridization and amplification technologies and in analogous methods that distinguish
SEQ ID NO:22-42 from related polynucleotide sequences. The precise length of a fragment of SEQ
ID NO:22-42 and the region of SEQ ID NO:22-42 to which the fragment corresponds are routinely
determinable by one of ordinary skill in the art based on the intended purpose for the fragment.

A fragment of SEQ ID NO:1-21 is encoded by a fragment of SEQ ID NO:22-42. A fragment
of SEQ ID NO:1-21 comprises a region of unique amino acid sequence that specifically identifies
SEQ ID NO:1-21. For example, a fragment of SEQ ID NO:1-21 is useful as an immunogenic peptide
for the development of antibodies that specifically recognize SEQ ID NO:1-21. The precise length of
a fragment of SEQ ID NO:1-21 and the region of SEQ ID NO:1-21 to which the fragment
corresponds are routinely determinable by one of ordinary skill in the art based on the intended
purpose for the fragment.

A "full length" polynucleotide sequence is one containing at least a translation initiation codon
(e.g., methionine) followed by an open reading frame and a translation termination codon. A "full
length" polynucleotide sequence encodes a "full length" polypeptide sequence.
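
The structural requirement in this definition can be sketched as a simple check. The Python example
below is illustrative only: the function name looks_full_length is hypothetical, and the sketch
assumes a DNA coding sequence that begins exactly at an ATG initiation codon and ends exactly at the
termination codon, whereas the definition requires only that these elements be present (untranslated
flanking sequence is not handled here).

```python
# Standard DNA stop codons (assumption: standard genetic code).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def looks_full_length(cds: str) -> bool:
    """Check for ATG, an uninterrupted open reading frame, and a final stop codon."""
    cds = cds.upper()
    if len(cds) < 6 or len(cds) % 3 != 0:
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    has_start = codons[0] == "ATG"
    has_stop = codons[-1] in STOP_CODONS
    # An in-frame stop before the last codon would interrupt the reading frame.
    uninterrupted = not any(codon in STOP_CODONS for codon in codons[:-1])
    return has_start and has_stop and uninterrupted


print(looks_full_length("ATGGCTTCTTAA"))  # True
print(looks_full_length("ATGTAAGCTTAA"))  # False: premature in-frame stop
print(looks_full_length("GCTGCTTCTTAA"))  # False: no initiation codon
```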

"Homology" refers to sequence similarity or, interchangeably, sequence identity, between two
or more polynucleotide sequences or two or more polypeptide sequences.

The terms "percent identity" and "% identity," as applied to polynucleotide sequences, refer to
the percentage of residue matches between at least two polynucleotide sequences aligned using a
standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way, gaps in
the sequences being compared in order to optimize alignment between two sequences, and therefore
achieve a more meaningful comparison of the two sequences.
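
As a minimal illustration of counting residue matches in an alignment that already contains gaps, the
Python sketch below computes a percent identity for two pre-aligned sequences. It is not the CLUSTAL V
or BLAST implementation; the function name percent_identity is hypothetical, and the choice of
denominator (all alignment columns, gapped columns included) is an assumption, since alignment
programs differ on this point.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as a match only if the residues agree and neither is a gap.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    return 100.0 * matches / len(aligned_a)


# One gap inserted in the first sequence, one mismatch in the last column.
print(round(percent_identity("ACGT-ACGT", "ACGTTACGA"), 1))  # 77.8
```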

Percent identity between polynucleotide sequences may be determined using the default
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e sequence
alignment program. This program is part of the LASERGENE software package, a suite of molecular
biological analysis programs (DNASTAR, Madison WI). CLUSTAL V is described in Higgins, D.G.
and P.M. Sharp (1989) CABIOS 5:151-153 and in Higgins, D.G. et al. (1992) CABIOS 8:189-191.
For pairwise alignments of polynucleotide sequences, the default parameters are set as follows:
Ktuple=2, gap penalty=5, window=4, and "diagonals saved"=4. The "weighted" residue weight table is
selected as the default. Percent identity is reported by CLUSTAL V as the "percent similarity" between
aligned polynucleotide sequences.

Alternatively, a suite of commonly used and freely available sequence comparison algorithms is
provided by the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search
Tool (BLAST) (Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410), which is available from several
sources, including the NCBI, Bethesda, MD, and on the Internet at
http://www.ncbi.nlm.nih.gov/BLAST/. The BLAST software suite includes various sequence analysis
programs including "blastn," that is used to align a known polynucleotide sequence with other
polynucleotide sequences from a variety of databases. Also available is a tool called "BLAST 2
Sequences" that is used for direct pairwise comparison of two nucleotide sequences. "BLAST 2
Sequences" can be accessed and used interactively at http://www.ncbi.nlm.nih.gov/gorf/bl2.html. The
"BLAST 2 Sequences" tool can be used for both blastn and blastp (discussed below). BLAST
programs are commonly used with gap and other parameters set to default settings. For example, to
compare two nucleotide sequences, one may use blastn with the "BLAST 2 Sequences" tool Version
2.0.12 (April-21-2000) set at default parameters. Such default parameters may be, for example:

Matrix: BLOSUM62
Reward for match: 1
Penalty for mismatch: -2
Open Gap: 5 and Extension Gap: 2 penalties
Gap x drop-off: 50
Expect: 10
Word Size: 11
Filter: on
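
To show how the match reward, mismatch penalty, and gap penalties listed above combine, the following
Python sketch scores a given gapped nucleotide alignment. It is an illustration under stated
assumptions rather than the BLAST implementation: the function name raw_score is hypothetical, and a
gap of k positions is assumed to cost the open-gap penalty plus k times the extension penalty.

```python
def raw_score(aligned_a: str, aligned_b: str, reward: int = 1, penalty: int = -2,
              gap_open: int = 5, gap_extend: int = 2, gap: str = "-") -> int:
    """Sum match rewards, mismatch penalties, and affine gap costs over an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    gap_strand = None  # which sequence the currently open gap is in, if any
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            strand = "a" if x == gap else "b"
            if strand != gap_strand:
                score -= gap_open      # charged once when a new gap starts
            score -= gap_extend        # charged for every gapped position
            gap_strand = strand
        else:
            score += reward if x == y else penalty
            gap_strand = None
    return score


# 10 matches (+10) and one gap of length 2 (-(5 + 2*2) = -9) give a raw score of 1.
print(raw_score("ACGTACGTACGT", "ACGTAC--ACGT"))  # 1
```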

Percent identity may be measured over the length of an entire defined sequence, for example, as
defined by a particular SEQ ID number, or may be measured over a shorter length, for example, over
the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at least 20, at
least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous nucleotides. Such
lengths are exemplary only, and it is understood that any fragment length supported by the sequences
shown herein, in the tables, figures, or Sequence Listing, may be used to describe a length over which
percentage identity may be measured.

Nucleic acid sequences that do not show a high degree of identity may nevertheless encode
similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes in
a nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid sequences
that all encode substantially the same protein.
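
The effect of degeneracy can be shown with a small worked example. The Python sketch below builds the
standard genetic code and translates two short, made-up DNA sequences (seq_a and seq_b are
illustrative and are not taken from the Sequence Listing); the helper name translate is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


seq_a = "ATGGCTTCT"  # Met-Ala-Ser using codons GCT and TCT
seq_b = "ATGGCGAGC"  # Met-Ala-Ser using codons GCG and AGC
identical = sum(x == y for x, y in zip(seq_a, seq_b))

print(translate(seq_a), translate(seq_b))                          # MAS MAS
print(f"{100 * identical / len(seq_a):.1f}% nucleotide identity")  # 55.6% nucleotide identity
```

The two sequences share only five of nine nucleotides yet encode the same tripeptide.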

The phrases "percent identity" and "% identity," as applied to polypeptide sequences, refer to
the percentage of residue matches between at least two polypeptide sequences aligned using a
standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some alignment
methods take into account conservative amino acid substitutions. Such conservative substitutions,
explained in more detail above, generally preserve the charge and hydrophobicity at the site of
substitution, thus preserving the structure (and therefore function) of the polypeptide.

Percent identity between polypeptide sequences may be determined using the default parameters
of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e sequence alignment
program (described and referenced above). For pairwise alignments of polypeptide sequences using
CLUSTAL V, the default parameters are set as follows: Ktuple=1, gap penalty=3, window=5, and
"diagonals saved"=5. The PAM250 matrix is selected as the default residue weight table. As with
polynucleotide alignments, the percent identity is reported by CLUSTAL V as the "percent similarity"
between aligned polypeptide sequence pairs.

Alternatively the NCBI BLAST software suite may be used. For example, for a pairwise
comparison of two polypeptide sequences, one may use the "BLAST 2 Sequences" tool Version 2.0.12
(April-21-2000) with blastp set at default parameters. Such default parameters may be, for example:

Matrix: BLOSUM62 

Open Gap: 11 and Extension Gap: 1 penalties
Gap X drop-off: 50
Expect: 10 
Word Size: 3 
Filter: on 

Percent identity may be measured over the length of an entire defined polypeptide sequence, for
example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for
example, over the length of a fragment taken from a larger, defined polypeptide sequence, for instance,
a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least 150
contiguous residues. Such lengths are exemplary only, and it is understood that any fragment length
supported by the sequences shown herein, in the tables, figures or Sequence Listing, may be used to
describe a length over which percentage identity may be measured.

"Human artificial chromosomes" (HACs) are linear microchromosomes which may contain
DNA sequences of about 6 kb to 10 Mb in size and which contain all of the elements required for
chromosome replication, segregation and maintenance.

The term "humanized antibody" refers to an antibody molecule in which the amino acid
sequence in the non-antigen binding regions has been altered so that the antibody more closely
resembles a human antibody, and still retains its original binding ability.

"Hybridization" refers to the process by which a polynucleotide strand anneals with a
complementary strand through base pairing under hybridization conditions. Specific
hybridization is an indication that two nucleic acid sequences share a high degree of complementarity.
Specific hybridization complexes form under permissive annealing conditions and remain hybridized
after the "washing" step(s). The washing step(s) is particularly important in determining the stringency
of the hybridization process, with more stringent conditions allowing less non-specific binding, i.e.,
binding between pairs of nucleic acid strands that are not perfectly matched. Permissive conditions for
annealing of nucleic acid sequences are routinely determinable by one of ordinary skill in the art and
may be consistent among hybridization experiments, whereas wash conditions may be varied among
experiments to achieve the desired stringency, and therefore hybridization specificity. Permissive
annealing conditions occur, for example, at 68°C in the presence of about 6 x SSC, about 1% (w/v)
SDS, and about 100 µg/ml sheared, denatured salmon sperm DNA.

Generally, stringency of hybridization is expressed, in part, with reference to the temperature
under which the wash step is carried out. Such wash temperatures are typically selected to be about
5°C to 20°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic
strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the
target sequence hybridizes to a perfectly matched probe. An equation for calculating Tm and conditions
for nucleic acid hybridization are well known and can be found in Sambrook, J. et al. (1989) Molecular
Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Press, Plainview NY; specifically
see volume 2, chapter 9.
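
The relationship between Tm and wash temperature may be sketched as follows. The Tm formula used here is one commonly quoted textbook approximation for long DNA duplexes and is an assumption of this sketch; the equation actually intended is the one given in the cited manual. The function names and the example values are hypothetical.

```python
import math


def estimate_tm_celsius(sequence, sodium_molar):
    """Rough Tm estimate for a long DNA:DNA duplex.

    Assumed approximation (not taken from this document):
    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 600/N
    """
    seq = sequence.upper()
    gc_percent = 100.0 * sum(base in "GC" for base in seq) / len(seq)
    return 81.5 + 16.6 * math.log10(sodium_molar) + 0.41 * gc_percent - 600.0 / len(seq)


def wash_temperature_window(tm_celsius):
    """Return (lowest, highest) wash temperature: 20 degrees C to 5 degrees C below Tm."""
    return (tm_celsius - 20.0, tm_celsius - 5.0)


if __name__ == "__main__":
    probe = "AC" * 50  # made-up 100-nucleotide sequence with 50% GC content
    tm = estimate_tm_celsius(probe, sodium_molar=1.0)
    print(round(tm, 1), wash_temperature_window(tm))
```

A wash temperature nearer the top of the returned window corresponds to a more stringent wash.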

High stringency conditions for hybridization between polynucleotides of the present invention
include wash conditions of 68°C in the presence of about 0.2 x SSC and about 0.1% SDS, for 1 hour.
Alternatively, temperatures of about 65°C, 60°C, 55°C, or 42°C may be used. SSC concentration may
be varied from about 0.1 to 2 x SSC, with SDS being present at about 0.1%. Typically, blocking
reagents are used to block non-specific hybridization. Such blocking reagents include, for instance,
sheared and denatured salmon sperm DNA at about 100-200 µg/ml. Organic solvent, such as
formamide at a concentration of about 35-50% v/v, may also be used under particular circumstances,
such as for RNA:DNA hybridizations. Useful variations on these wash conditions will be readily
apparent to those of ordinary skill in the art. Hybridization, particularly under high stringency
conditions, may be suggestive of evolutionary similarity between the nucleotides. Such similarity is
strongly indicative of a similar role for the nucleotides and their encoded polypeptides.

The term "hybridization complex" refers to a complex formed between two nucleic acid
sequences by virtue of the formation of hydrogen bonds between complementary bases. A hybridization
complex may be formed in solution (e.g., Cot or Rot analysis) or formed between one nucleic acid
sequence present in solution and another nucleic acid sequence immobilized on a solid support (e.g.,
paper, membranes, filters, chips, pins or glass slides, or any other appropriate substrate to which cells
or their nucleic acids have been fixed).

The words "insertion" and "addition" refer to changes in an amino acid or nucleotide sequence
resulting in the addition of one or more amino acid residues or nucleotides, respectively.

"Immune response" can refer to conditions associated with inflammation, trauma, immune
disorders, or infectious or genetic disease, etc. These conditions can be characterized by expression of
various factors, e.g., cytokines, chemokines, and other signaling molecules, which may affect cellular
and systemic defense systems.

An "immunogenic fragment" is a polypeptide or oligopeptide fragment of PRTS which is
capable of eliciting an immune response when introduced into a living organism, for example, a
mammal. The term "immunogenic fragment" also includes any polypeptide or oligopeptide fragment of
PRTS which is useful in any of the antibody production methods disclosed herein or known in the art.

The term "microarray" refers to an arrangement of a plurality of polynucleotides, polypeptides,
or other chemical compounds on a substrate.

The terms "element" and "array element" refer to a polynucleotide, polypeptide, or other
chemical compound having a unique and defined position on a microarray.

The term "modulate" refers to a change in the activity of PRTS. For example, modulation may
cause an increase or a decrease in protein activity, binding characteristics, or any other biological,
functional, or immunological properties of PRTS.

The phrases "nucleic acid" and "nucleic acid sequence" refer to a nucleotide, oligonucleotide,
polynucleotide, or any fragment thereof. These phrases also refer to DNA or RNA of genomic or
synthetic origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA), or to any DNA-like or RNA-like material.

"Operably linked" refers to the situation in which a first nucleic acid sequence is placed in a
functional relationship with a second nucleic acid sequence. For instance, a promoter is operably
linked to a coding sequence if the promoter affects the transcription or expression of the coding
sequence. Operably linked DNA sequences may be in close proximity or contiguous and, where
necessary to join two protein coding regions, in the same reading frame.

"Peptide nucleic acid" (PNA) refers to an antisense molecule or anti-gene agent which
comprises an oligonucleotide of at least about 5 nucleotides in length linked to a peptide backbone of
amino acid residues ending in lysine. The terminal lysine confers solubility to the composition. PNAs
preferentially bind complementary single stranded DNA or RNA and stop transcript elongation, and
may be pegylated to extend their lifespan in the cell.

"Post-translational modification" of a PRTS may involve lipidation, glycosylation,
phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in the
art. These processes may occur synthetically or biochemically. Biochemical modifications will vary by
cell type depending on the enzymatic milieu of PRTS.

"Probe" refers to nucleic acid sequences encoding PRTS, their complements, or fragments
thereof, which are used to detect identical, allelic or related nucleic acid sequences. Probes are
isolated oligonucleotides or polynucleotides attached to a detectable label or reporter molecule. Typical
labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes. "Primers" are
short nucleic acids, usually DNA oligonucleotides, which may be annealed to a target polynucleotide by
complementary base-pairing. The primer may then be extended along the target DNA strand by a DNA
polymerase enzyme. Primer pairs can be used for amplification (and identification) of a nucleic acid
sequence, e.g., by the polymerase chain reaction (PCR).

Probes and primers as used in the present invention typically comprise at least 15 contiguous
nucleotides of a known sequence. In order to enhance specificity, longer probes and primers may also
be employed, such as probes and primers that comprise at least 20, 25, 30, 40, 50, 60, 70, 80, 90, 100,
or at least 150 consecutive nucleotides of the disclosed nucleic acid sequences. Probes and primers may
be considerably longer than these examples, and it is understood that any length supported by the
specification, including the tables, figures, and Sequence Listing, may be used.
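
As a minimal sketch of the length criterion above, the following Python checks whether a candidate oligonucleotide is a contiguous stretch of at least 15 nucleotides of a reference sequence or of its complementary strand. The reference sequence is a made-up example, the function names are hypothetical, and the check ignores everything else that matters in practice (melting temperature, secondary structure, uniqueness).

```python
# Mapping used to build the complementary strand of a DNA sequence.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def is_candidate_probe(candidate, reference, min_length=15):
    """True if candidate is a contiguous fragment (>= min_length nt) of either strand."""
    candidate = candidate.upper()
    reference = reference.upper()
    if len(candidate) < min_length:
        return False
    return candidate in reference or candidate in reverse_complement(reference)


# Illustrative reference only (assumed for this sketch).
REFERENCE = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"

if __name__ == "__main__":
    print(is_candidate_probe(REFERENCE[5:25], REFERENCE))   # 20 contiguous nucleotides
    print(is_candidate_probe(REFERENCE[:10], REFERENCE))    # too short
```

Raising `min_length` to 20, 25, 30 and so on corresponds to the longer probes and primers mentioned above.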

Methods for preparing and using probes and primers are described in the references, for
example Sambrook, J. et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold
Spring Harbor Press, Plainview NY; Ausubel, F.M. et al. (1987) Current Protocols in Molecular
Biology, Greene Publ. Assoc. & Wiley-Intersciences, New York NY; Innis, M. et al. (1990) PCR
Protocols, A Guide to Methods and Applications, Academic Press, San Diego CA. PCR primer pairs
can be derived from a known sequence, for example, by using computer programs intended for that
purpose such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge
MA).

Oligonucleotides for use as primers are selected using software known in the art for such
purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to
100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to 5,000
nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer selection
programs have incorporated additional features for expanded capabilities. For example, the PrimOU
primer selection program (available to the public from the Genome Center at University of Texas South
West Medical Center, Dallas TX) is capable of choosing specific primers from megabase sequences
and is thus useful for designing primers on a genome-wide scope. The Primer3 primer selection
program (available to the public from the Whitehead Institute/MIT Center for Genome Research,
Cambridge MA) allows the user to input a "mispriming library," in which sequences to avoid as primer
binding sites are user-specified. Primer3 is useful, in particular, for the selection of oligonucleotides for
microarrays. (The source code for the latter two primer selection programs may also be obtained from
their respective sources and modified to meet the user's specific needs.) The PrimeGen program
(available to the public from the UK Human Genome Mapping Project Resource Centre, Cambridge
UK) designs primers based on multiple sequence alignments, thereby allowing selection of primers that
hybridize to either the most conserved or least conserved regions of aligned nucleic acid sequences.
Hence, this program is useful for identification of both unique and conserved oligonucleotides and
polynucleotide fragments. The oligonucleotides and polynucleotide fragments identified by any of the
above selection methods are useful in hybridization technologies, for example, as PCR or sequencing
primers, microarray elements, or specific probes to identify fully or partially complementary
polynucleotides in a sample of nucleic acids. Methods of oligonucleotide selection are not limited to
those described above.

A "recombinant nucleic acid" is a sequence that is not naturally occurring or has a sequence
that is made by an artificial combination of two or more otherwise separated segments of sequence.
This artificial combination is often accomplished by chemical synthesis or, more commonly, by the
artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques
such as those described in Sambrook, supra. The term recombinant includes nucleic acids that have
been altered solely by addition, substitution, or deletion of a portion of the nucleic acid. Frequently, a
recombinant nucleic acid may include a nucleic acid sequence operably linked to a promoter sequence.
Such a recombinant nucleic acid may be part of a vector that is used, for example, to transform a cell.

Alternatively, such recombinant nucleic acids may be part of a viral vector, e.g., based on a
vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is
expressed, inducing a protective immunological response in the mammal.



WO 01/98468 PCT/US01/19178

A "regulatory element" refers to a nucleic acid sequence usually derived from untranslated
regions of a gene and includes enhancers, promoters, introns, and 5' and 3' untranslated regions (UTRs).
Regulatory elements interact with host or viral proteins which control transcription, translation, or RNA
stability.

"Reporter molecules" are chemical or biochemical moieties used for labeling a nucleic acid,
amino acid, or antibody. Reporter molecules include radionuclides; enzymes; fluorescent,
chemiluminescent, or chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and
other moieties known in the art.

An "RNA equivalent," in reference to a DNA sequence, is composed of the same linear
sequence of nucleotides as the reference DNA sequence with the exception that all occurrences of the
nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose
instead of deoxyribose.
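
At the level of sequence notation, the definition above amounts to replacing every T with U. A minimal Python sketch, with a hypothetical function name and a made-up input, is:

```python
def rna_equivalent(dna):
    """Return the RNA equivalent of a DNA sequence written in the letters A, C, G, T."""
    return dna.upper().replace("T", "U")


if __name__ == "__main__":
    print(rna_equivalent("ATGCTT"))
```

The change from deoxyribose to ribose has no counterpart in the one-letter notation, so only the base substitution appears in the code.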

The term "sample" is used in its broadest sense. A sample suspected of containing PRTS,
nucleic acids encoding PRTS, or fragments thereof may comprise a bodily fluid; an extract from a cell,
chromosome, organelle, or membrane isolated from a cell; a cell; genomic DNA, RNA, or cDNA, in
solution or bound to a substrate; a tissue; a tissue print; etc.

The terms "specific binding" and "specifically binding" refer to that interaction between a
protein or peptide and an agonist, an antibody, an antagonist, a small molecule, or any natural or
synthetic binding composition. The interaction is dependent upon the presence of a particular structure
of the protein, e.g., the antigenic determinant or epitope, recognized by the binding molecule. For
example, if an antibody is specific for epitope "A," the presence of a polypeptide comprising the epitope
A, or the presence of free unlabeled A, in a reaction containing free labeled A and the antibody will
reduce the amount of labeled A that binds to the antibody.

The term "substantially purified" refers to nucleic acid or amino acid sequences that are
removed from their natural environment and are isolated or separated, and are at least 60% free,
preferably at least 75% free, and most preferably at least 90% free from other components with which
they are naturally associated.

A "substitution" refers to the replacement of one or more amino acid residues or nucleotides by
different amino acid residues or nucleotides, respectively.

"Substrate" refers to any suitable rigid or semi-rigid support including membranes, filters,
chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers,
microparticles and capillaries. The substrate can have a variety of surface forms, such as wells,
trenches, pins, channels and pores, to which polynucleotides or polypeptides are bound.

A "transcript image" refers to the collective pattern of gene expression by a particular cell type
or tissue under given conditions at a given time.

"Transformation" describes a process by which exogenous DNA is introduced into a recipient
cell. Transformation may occur under natural or artificial conditions according to various methods well
known in the art, and may rely on any known method for the insertion of foreign nucleic acid sequences
into a prokaryotic or eukaryotic host cell. The method for transformation is selected based on the type
of host cell being transformed and may include, but is not limited to, bacteriophage or viral infection,
electroporation, heat shock, lipofection, and particle bombardment. The term "transformed cells"
includes stably transformed cells in which the inserted DNA is capable of replication either as an
autonomously replicating plasmid or as part of the host chromosome, as well as transiently transformed
cells which express the inserted DNA or RNA for limited periods of time.

A "transgenic organism," as used herein, is any organism, including but not limited to
animals and plants, in which one or more of the cells of the organism contains heterologous nucleic
acid introduced by way of human intervention, such as by transgenic techniques well known in the
art. The nucleic acid is introduced into the cell, directly or indirectly by introduction into a precursor
of the cell, by way of deliberate genetic manipulation, such as by microinjection or by infection with
a recombinant virus. The term genetic manipulation does not include classical cross-breeding, or in
vitro fertilization, but rather is directed to the introduction of a recombinant DNA molecule. The
transgenic organisms contemplated in accordance with the present invention include bacteria,
cyanobacteria, fungi, plants and animals. The isolated DNA of the present invention can be
introduced into the host by methods known in the art, for example infection, transfection,
transformation or transconjugation. Techniques for transferring the DNA of the present invention
into such organisms are widely known and provided in references such as Sambrook et al. (1989),
supra.

A "variant" of a particular nucleic acid sequence is defined as a nucleic acid sequence having at
least 40% sequence identity to the particular nucleic acid sequence over a certain length of one of the
nucleic acid sequences using blastn with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999)
set at default parameters. Such a pair of nucleic acids may show, for example, at least 50%, at least
60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at
least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence
identity over a certain defined length. A variant may be described as, for example, an "allelic" (as
defined above), "splice," "species," or "polymorphic" variant. A splice variant may have significant
identity to a reference molecule, but will generally have a greater or lesser number of polynucleotides
due to alternative splicing of exons during mRNA processing. The corresponding polypeptide may
possess additional functional domains or lack domains that are present in the reference molecule.
Species variants are polynucleotide sequences that vary from one species to another. The resulting
polypeptides will generally have significant amino acid identity relative to each other. A polymorphic
variant is a variation in the polynucleotide sequence of a particular gene between individuals of a given
species. Polymorphic variants also may encompass "single nucleotide polymorphisms" (SNPs) in
which the polynucleotide sequence varies by one nucleotide base. The presence of SNPs may be
indicative of, for example, a certain population, a disease state, or a propensity for a disease state.
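
As a rough illustration of a single nucleotide difference between two otherwise identical sequences, the following Python lists the positions at which two equal-length sequences differ. The sequences are made-up examples, the function name is hypothetical, and the sketch assumes the sequences are already aligned without gaps.

```python
def single_base_differences(reference, variant):
    """List (1-based position, reference base, variant base) for each mismatch."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    pairs = zip(reference.upper(), variant.upper())
    return [(i + 1, r, v) for i, (r, v) in enumerate(pairs) if r != v]


if __name__ == "__main__":
    # Illustrative sequences only (assumed for this sketch).
    print(single_base_differences("ATGGCCATT", "ATGGCTATT"))
```

A result containing exactly one entry corresponds to the single-base variation described above.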

A "variant" of a particular polypeptide sequence is defined as a polypeptide sequence having at
least 40% sequence identity to the particular polypeptide sequence over a certain length of one of the
polypeptide sequences using blastp with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999)
set at default parameters. Such a pair of polypeptides may show, for example, at least 50%, at least
60%, at least 70%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence identity over a
certain defined length of one of the polypeptides.
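
The percent identity thresholds used in the variant definitions can be sketched in Python as below. The sketch assumes the two sequences have already been aligned by some other tool (it does not reproduce BLAST or CLUSTAL V), uses "-" as the gap character, and counts a gap opposite a residue as a mismatch. The function names and example alignments are hypothetical.

```python
GAP = "-"


def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences hold the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns where both sequences have a gap; they carry no information.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == GAP and b == GAP)
    ]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a, aligned_b, threshold=40.0):
    """True if the aligned pair reaches the given percent identity (default 40%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Illustrative alignment only (assumed for this sketch).
    print(round(percent_identity("MKT-AYIAKQR", "MKTLAYLAKQR"), 1))
    print(meets_identity_threshold("MKT-AYIAKQR", "MKTLAYLAKQR", threshold=90.0))
```

Passing a different `threshold` (50, 60, 70 and so on) corresponds to the higher identity levels listed in the definitions.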


THE INVENTION 

The invention is based on the discovery of new human proteases (PRTS), the polynucleotides
encoding PRTS, and the use of these compositions for the diagnosis, treatment, or prevention of
gastrointestinal, cardiovascular, autoimmune/inflammatory, cell proliferative, developmental, epithelial,
neurological, and reproductive disorders.

Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide
sequences of the invention. Each polynucleotide and its corresponding polypeptide are correlated to a
single Incyte project identification number (Incyte Project ID). Each polypeptide sequence is denoted
by both a polypeptide sequence identification number (Polypeptide SEQ ID NO:) and an Incyte
polypeptide sequence number (Incyte Polypeptide ID) as shown. Each polynucleotide sequence is
denoted by both a polynucleotide sequence identification number (Polynucleotide SEQ ID NO:) and an
Incyte polynucleotide consensus sequence number (Incyte Polynucleotide ID) as shown.

Table 2 shows sequences with homology to the polypeptides of the invention as identified by
BLAST analysis against the GenBank protein (genpept) database. Columns 1 and 2 show the
polypeptide sequence identification number (Polypeptide SEQ ID NO:) and the corresponding Incyte
polypeptide sequence number (Incyte Polypeptide ID) for polypeptides of the invention. Column 3
shows the GenBank identification number (Genbank ID NO:) of the nearest GenBank homolog.
Column 4 shows the probability score for the match between each polypeptide and its GenBank
homolog. Column 5 shows the annotation of the GenBank homolog along with relevant citations where
applicable, all of which are expressly incorporated by reference herein.

Table 3 shows various structural features of the polypeptides of the invention. Columns 1 and 2
show the polypeptide sequence identification number (SEQ ID NO:) and the corresponding Incyte
polypeptide sequence number (Incyte Polypeptide ID) for each polypeptide of the invention. Column 3
shows the number of amino acid residues in each polypeptide. Column 4 shows potential
phosphorylation sites, and column 5 shows potential glycosylation sites, as determined by the MOTIFS
program of the GCG sequence analysis software package (Genetics Computer Group, Madison WI).
Column 6 shows amino acid residues comprising signature sequences, domains, and motifs. Column 7
shows analytical methods for protein structure/function analysis and in some cases, searchable
databases to which the analytical methods were applied.

Together, Tables 2 and 3 summarize the properties of polypeptides of the invention, and these
properties establish that the claimed polypeptides are proteases. For example, SEQ ID NO:1 is a
ubiquitin carboxyl terminal hydrolase. SEQ ID NO:1 is 48% identical, from residue M1 to residue
G225, to human ubiquitin-specific processing protease (GenBank ID gSW1757) as determined by the
Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 1.00e-49,
which indicates the probability of obtaining the observed polypeptide sequence alignment by chance.
SEQ ID NO:1 contains a ubiquitin carboxyl terminal hydrolase catalytic site domain as determined by
searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM
database of conserved protein family domains. The score is 53.4 bits and the E-value is 4.9e-12, which
indicates the probability of obtaining the observed structural motif by chance. The presence of this
motif was corroborated by BLIMPS (probability score=2.<Se-4) and MOTIFS analyses. This provides
further evidence that SEQ ID NO:1 is a ubiquitin carboxyl-terminal hydrolase. In an alternative
example, SEQ ID NO:2 is 45% identical to amino acids 15-235 of human prostasin, a serine protease
(GenBank ID g1143194) as determined by the Basic Local Alignment Search Tool (BLAST). (See
Table 2.) The BLAST probability score is 1.3e-46, which indicates the probability of obtaining the
observed polypeptide sequence alignment by chance. SEQ ID NO:2 also contains a trypsin family
serine protease active site domain as determined by searching for statistically significant matches in the
hidden Markov model (HMM)-based PFAM database of conserved protein family domains. This
match has a probability score of 2.7e-58. BLIMPS, MOTIFS, and PROFILESCAN analyses confirm
the presence of this domain. (See Table 3.) BLIMPS analysis also reveals a kringle domain, providing
further corroborative evidence that SEQ ID NO:2 is a serine protease of the trypsin family. In an
alternative example, SEQ ID NO:7 is a dipeptidase which hydrolyses a variety of peptides (Kozak, E.
and S. Tate (1982) J. Biol. Chem. 257:6322-6327), and is responsible for the hydrolysis of the beta
lactam rings of antibiotics such as penem and carbapenem (Campbell et al., (1984) J. Biol. Chem.
259:14586-14590). SEQ ID NO:7 shows 48% amino acid sequence identity over 377 amino acids
(total length equals 411 amino acids) to human dipeptidase precursor (GenBank ID g219600) as
determined by Basic Local Alignment Search Tool (BLAST). The BLAST probability score is 1.1e-88,
which indicates the probability of obtaining the observed polypeptide sequence alignment by chance.
Additionally, the protease of the invention demonstrates a renal dipeptidase domain as determined by
searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM
database of conserved protein family domains. The HMM score for the renal dipeptidase PFAM hit is
412.7. Data from BLIMPS, MOTIFS, BLAST-DOMO, and BLAST-PRODOM analyses provide
further corroborative evidence that SEQ ID NO:7 is a renal dipeptidase. The BLIMPS-BLOCKS hit
scores for localized regions range from 1040-1537. The BLAST-DOMO hit probability score is 5.2e-85.
The BLAST-PRODOM hit probability score is 4.7e-73. In an alternative example, SEQ ID NO:8
is 86% identical to human transmembrane tryptase (GenBank ID g6103629) as determined by the Basic
Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 3.9e-166,
which indicates the probability of obtaining the observed polypeptide sequence alignment by chance.
SEQ ID NO:8 contains a trypsin family protease active site domain with a probability score of 5.3e-89
as determined by searching for matches in the hidden Markov model (HMM)-based PFAM database of
conserved protein family domains. BLIMPS, MOTIFS, and PROFILESCAN analyses confirm the
presence of this motif. BLIMPS analysis also shows that SEQ ID NO:8 contains a kringle domain and
a type I fibronectin domain. HMMER-based analysis reveals the presence of a transmembrane domain.
(See Table 3.) Taken together, these analyses show that SEQ ID NO:8 is a transmembrane member of
the trypsin family of serine proteases. In an alternative example, SEQ ID NO:17 shares 44% local
identity with human membrane-type serine protease 1 (MT-SP1, GenBank ID g6002714) as determined
by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is
5.1e-94, which indicates the probability of obtaining the observed polypeptide sequence alignment by
chance. SEQ ID NO:17 contains a trypsin family serine protease active site domain as determined by
searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM
database of conserved protein family domains. (See Table 3.) HMM-based analysis also reveals a
transmembrane domain near the N-terminus of SEQ ID NO:17. A domain found in the low-density
lipoprotein receptor and other proteins, including MT-SP1 (PDOC00929) was also identified in this
way. The presence of the trypsin active site motif is confirmed by PROFILESCAN, BLIMPS, and
MOTIFS analyses. BLIMPS analysis revealed the presence of kringle and type I fibronectin domains.
Taken together, these data provide further corroborative evidence that SEQ ID NO:17 is a
transmembrane member of the trypsin family of serine proteases. SEQ ID NO:3-6, SEQ ID NO:9-16,

assembled using cDNA sequences or coding (exon) sequences derived from genomic DNA, or any
combination of these two types of sequences. Columns 1 and 2 list the polynucleotide sequence
identification number (Polynucleotide SEQ ID NO:) and the corresponding Incyte polynucleotide
consensus sequence number (Incyte Polynucleotide ID) for each polynucleotide of the invention.
Column 3 shows the length of each polynucleotide sequence in basepairs. Column 4 lists fragments of
the polynucleotide sequences which are useful, for example, in hybridization or amplification
technologies that identify SEQ ID NO:22-42 or that distinguish between SEQ ID NO:22-42 and
related polynucleotide sequences. Column 5 shows identification numbers corresponding to cDNA
sequences, coding sequences (exons) predicted from genomic DNA, and/or sequence assemblages
comprised of both cDNA and genomic DNA. These sequences were used to assemble the full length
polynucleotide sequences of the invention. Columns 6 and 7 of Table 4 show the nucleotide start (5')
and stop (3') positions of the cDNA and/or genomic sequences in column 5 relative to their respective
full length sequences.



identification number of an Incyte cDNA sequence, and PROSTMY01 is the cDNA library from which
it is derived. Incyte cDNAs for which cDNA libraries are not indicated were derived from pooled
cDNA libraries (e.g., 71041539V1). Alternatively, the identification numbers in column 5 may refer to
GenBank cDNAs or ESTs (e.g., g5745066) which contributed to the assembly of the full length
polynucleotide sequences. Alternatively, the identification numbers in column 5 may refer to coding
regions predicted by Genscan analysis of genomic DNA. For example,
GNN.g7208751_000002_002.edit is the identification number of a Genscan-predicted coding sequence,
with g7208751 being the GenBank identification number of the sequence to which Genscan was
applied. The Genscan-predicted coding sequences may have been edited prior to assembly. (See
Example IV.) Alternatively, the identification numbers in column 5 may refer to assemblages of both
cDNA and Genscan-predicted exons brought together by an "exon stitching" algorithm. For example,
FL1389845_00001 represents a "stitched" sequence in which 1389845 is the identification number of
the cluster of sequences to which the algorithm was applied, and 00001 is the number of the prediction
generated by the algorithm. (See Example V.) Alternatively, the identification numbers in column 5
may refer to assemblages of both cDNA and Genscan-predicted exons brought together by an
"exon-stretching" algorithm. For example, FL2256251_g7708357_006002_g6103629 is the identification




number of a "stretched" sequence, with 2256251 being the Incyte project identification number,
g7708357 being the GenBank identification number of the human genomic sequence to which the
"exon-stretching" algorithm was applied, and g6103629 being the GenBank identification number of
the nearest GenBank protein homolog. (See Example V.) In some cases, Incyte cDNA coverage
redundant with the sequence coverage shown in column 5 was obtained to confirm the final consensus
polynucleotide sequence, but the relevant Incyte cDNA identification numbers are not shown.
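
The identifier formats described for column 5 can be told apart mechanically. The following Python sketch is illustrative only: the patterns are inferred from the example identifiers quoted in the text, the function and field names are hypothetical, and the meaning of the third field of a "stretched" identifier is not stated in the text, so it is captured without interpretation.

```python
import re

# Patterns inferred from the example identifiers in the text; the complete
# grammar of real column 5 identifiers is an assumption.
_PATTERNS = [
    ("genscan", re.compile(r"^GNN\.(?P<genbank_id>g\d+)_(?P<suffix>.+)$")),
    ("stretched", re.compile(
        r"^FL(?P<project_id>\d+)_(?P<genomic_id>g\d+)"
        r"_(?P<serial>\d+)_(?P<homolog_id>g\d+)$")),
    ("stitched", re.compile(r"^FL(?P<cluster_id>\d+)_(?P<prediction>\d+)$")),
    ("genbank", re.compile(r"^(?P<genbank_id>g\d+)$")),
]


def classify_identifier(identifier: str) -> dict:
    """Return the identifier kind plus its named parts, or kind 'other'."""
    for kind, pattern in _PATTERNS:
        match = pattern.match(identifier)
        if match:
            return {"kind": kind, **match.groupdict()}
    # Incyte cDNA identifiers and anything unrecognized fall through here.
    return {"kind": "other"}
```

For instance, `classify_identifier("FL1389845_00001")` reports a stitched sequence with cluster number 1389845 and prediction number 00001.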

Table 5 shows the representative cDNA libraries for those full length polynucleotide sequences
which were assembled using Incyte cDNA sequences. The representative cDNA library is the Incyte
cDNA library which is most frequently represented by the Incyte cDNA sequences which were used to
assemble and confirm the above polynucleotide sequences. The tissues and vectors which were used to
construct the cDNA libraries shown in Table 5 are described in Table 6.
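
The selection of a representative library amounts to a frequency count. A minimal Python sketch follows, assuming the library of origin is known for each contributing cDNA; the function name and the library labels in the usage note are hypothetical.

```python
from collections import Counter


def representative_library(libraries):
    """Return the library name that occurs most often among the cDNAs.

    `libraries` holds one library name per contributing cDNA sequence.
    Ties are resolved in favor of the library encountered first.
    """
    if not libraries:
        raise ValueError("at least one contributing cDNA is required")
    library, _count = Counter(libraries).most_common(1)[0]
    return library
```

Given the list `["LIB_A", "LIB_B", "LIB_A"]`, the function returns `"LIB_A"`.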

The invention also encompasses PRTS variants. A preferred PRTS variant is one which has at
least about 80%, or alternatively at least about 90%, or even at least about 95% amino acid sequence
identity to the PRTS amino acid sequence, and which contains at least one functional or structural
characteristic of PRTS.
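
The identity thresholds above can be checked on a pair of sequences that have already been aligned. The Python sketch below is an illustration under stated assumptions: it counts identical residues over aligned columns, treats "-" as a gap, and ignores columns where both sequences are gapped. The alignment method and the exact definition of percent identity used for the invention are set out elsewhere in the specification and may differ; the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns that hold the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")  # skip columns gapped in both
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def variant_tier(identity: float):
    """Return the highest threshold (95, 90 or 80) met, or None."""
    for threshold in (95, 90, 80):
        if identity >= threshold:
            return threshold
    return None
```

A single substitution in a ten-residue alignment gives 90% identity, which meets the 90% tier but not the 95% tier.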

The invention also encompasses polynucleotides which encode PRTS. In a particular
embodiment, the invention encompasses a polynucleotide sequence comprising a sequence selected from
the group consisting of SEQ ID NO:22-42, which encodes PRTS. The polynucleotide sequences of
SEQ ID NO:22-42, as presented in the Sequence Listing, embrace the equivalent RNA sequences,
wherein occurrences of the nitrogenous base thymine are replaced with uracil, and the sugar backbone
is composed of ribose instead of deoxyribose.
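
At the level of sequence notation, the equivalent RNA sequence is obtained by writing U for every T. A short Python sketch, with a hypothetical function name and an input restricted to the four unambiguous DNA letters:

```python
def dna_to_rna(dna: str) -> str:
    """Write the RNA equivalent of a DNA sequence (thymine -> uracil)."""
    dna = dna.upper()
    unexpected = set(dna) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return dna.replace("T", "U")
```

The change of sugar from deoxyribose to ribose is chemical and has no counterpart in the letter notation.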

The invention also encompasses a variant of a polynucleotide sequence encoding PRTS. In
particular, such a variant polynucleotide sequence will have at least about 70%, or alternatively at least
about 85%, or even at least about 95% polynucleotide sequence identity to the polynucleotide sequence
encoding PRTS. A particular aspect of the invention encompasses a variant of a polynucleotide
sequence comprising a sequence selected from the group consisting of SEQ ID NO:22-42 which has at
least about 70%, or alternatively at least about 85%, or even at least about 95% polynucleotide
sequence identity to a nucleic acid sequence selected from the group consisting of SEQ ID NO:22-42.
Any one of the polynucleotide variants described above can encode an amino acid sequence which
contains at least one functional or structural characteristic of PRTS.

It will be appreciated by those skilled in the art that as a result of the degeneracy of the genetic
code, a multitude of polynucleotide sequences encoding PRTS, some bearing minimal similarity to the
polynucleotide sequences of any known and naturally occurring gene, may be produced. Thus, the
invention contemplates each and every possible variation of polynucleotide sequence that could be made
by selecting combinations based on possible codon choices. These combinations are made in
accordance with the standard triplet genetic code as applied to the polynucleotide sequence of naturally
occurring PRTS, and all such variations are to be considered as being specifically disclosed.
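
The size of the "multitude" of coding sequences follows directly from the standard genetic code: the number of distinct sequences encoding a given peptide is the product of the number of synonymous codons for each residue. The Python sketch below illustrates the arithmetic; the function name is hypothetical and stop codons are not counted.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "M": 1, "W": 1,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "I": 3,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "L": 6, "R": 6, "S": 6,
}


def count_coding_sequences(peptide: str) -> int:
    """Count the distinct codon combinations that encode `peptide`."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide.upper())
```

Even the tripeptide Leu-Ser-Arg can be encoded in 6 x 6 x 6 = 216 ways, so the count for a full-length protein is very large.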

Although nucleotide sequences which encode PRTS and its variants are generally capable of
hybridizing to the nucleotide sequence of the naturally occurring PRTS under appropriately selected
conditions of stringency, it may be advantageous to produce nucleotide sequences encoding PRTS or its
derivatives possessing a substantially different codon usage, e.g., inclusion of non-naturally occurring
codons. Codons may be selected to increase the rate at which expression of the peptide occurs in a
particular prokaryotic or eukaryotic host in accordance with the frequency with which particular codons
are utilized by the host. Other reasons for substantially altering the nucleotide sequence encoding
PRTS and its derivatives without altering the encoded amino acid sequences include the production of
RNA transcripts having more desirable properties, such as a greater half-life, than transcripts produced
from the naturally occurring sequence.
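
Codon selection according to host usage can be sketched as a synonymous recoding: each codon is replaced by the synonymous codon that the host uses most often, leaving the encoded amino acid sequence unchanged. The Python example below is a minimal illustration, not a description of any particular optimization software. The function names are hypothetical, and the `usage` argument is assumed to be a caller-supplied mapping from codon to relative frequency in the chosen host.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence ('*' marks a stop codon)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode_for_host(dna: str, usage: dict) -> str:
    """Swap each codon for the host's most frequent synonymous codon."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    recoded = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        residue = CODON_TABLE[codon]
        synonyms = [c for c, aa in CODON_TABLE.items() if aa == residue]
        # Highest usage wins; on a tie the original codon is kept.
        recoded.append(
            max(synonyms, key=lambda c: (usage.get(c, 0.0), c == codon))
        )
    result = "".join(recoded)
    assert translate(result) == translate(dna)  # protein is unchanged
    return result
```

With the illustrative (not measured) usage values `{"CTG": 0.5, "CTT": 0.1, "AAA": 0.7, "AAG": 0.3}`, the sequence `ATGTTAAAGTAA` is recoded to `ATGCTGAAATAA`, which still encodes Met-Leu-Lys followed by a stop codon.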

The invention also encompasses production of DNA sequences which encode PRTS and PRTS
derivatives, or fragments thereof, entirely by synthetic chemistry. After production, the synthetic
sequence may be inserted into any of the many available expression vectors and cell systems using
reagents well known in the art. Moreover, synthetic chemistry may be used to introduce mutations into
a sequence encoding PRTS or any fragment thereof.

Also encompassed by the invention are polynucleotide sequences that are capable of
hybridizing to the claimed polynucleotide sequences, and, in particular, to those shown in SEQ ID
NO:22-42 and fragments thereof under various conditions of stringency. (See, e.g., Wahl, G.M. and
S.L. Berger (1987) Methods Enzymol. 152:399-407; Kimmel, A.R. (1987) Methods Enzymol.
152:507-511.) Hybridization conditions, including annealing and wash conditions, are described in
"Definitions."

Methods for DNA sequencing are well known in the art and may be used to practice any of the
embodiments of the invention. The methods may employ such enzymes as the Klenow fragment of
DNA polymerase I, SEQUENASE (US Biochemical, Cleveland OH), Taq polymerase (Applied
Biosystems), thermostable T7 polymerase (Amersham Pharmacia Biotech, Piscataway NJ), or
combinations of polymerases and proofreading exonucleases such as those found in the ELONGASE
amplification system (Life Technologies, Gaithersburg MD). Preferably, sequence preparation is
automated with machines such as the MICROLAB 2200 liquid transfer system (Hamilton, Reno NV),
PTC200 thermal cycler (MJ Research, Watertown MA) and ABI CATALYST 800 thermal cycler
(Applied Biosystems). Sequencing is then carried out using either the ABI 373 or 377 DNA sequencing
system (Applied Biosystems), the MEGABACE 1000 DNA sequencing system (Molecular Dynamics,
Sunnyvale CA), or other systems known in the art. The resulting sequences are analyzed using a
variety of algorithms which are well known in the art. (See, e.g., Ausubel, F.M. (1997) Short Protocols
in Molecular Biology, John Wiley & Sons, New York NY, unit 7.7; Meyers, R.A. (1995) Molecular
Biology and Biotechnology, Wiley VCH, New York NY, pp. 856-853.)

The nucleic acid sequences encoding PRTS may be extended utilizing a partial nucleotide
sequence and employing various PCR-based methods known in the art to detect upstream sequences,
such as promoters and regulatory elements. For example, one method which may be employed,
restriction-site PCR, uses universal and nested primers to amplify unknown sequence from genomic
DNA within a cloning vector. (See, e.g., Sarkar, G. (1993) PCR Methods Applic. 2:318-322.)
Another method, inverse PCR, uses primers that extend in divergent directions to amplify unknown
sequence from a circularized template. The template is derived from restriction fragments comprising a
known genomic locus and surrounding sequences. (See, e.g., Triglia, T. et al. (1988) Nucleic Acids
Res. 16:8186.) A third method, capture PCR, involves PCR amplification of DNA fragments adjacent
to known sequences in human and yeast artificial chromosome DNA. (See, e.g., Lagerstrom, M. et al.
(1991) PCR Methods Applic. 1:111-119.) In this method, multiple restriction enzyme digestions and
ligations may be used to insert an engineered double-stranded sequence into a region of unknown
sequence before performing PCR. Other methods which may be used to retrieve unknown sequences
are known in the art. (See, e.g., Parker, J.D. et al. (1991) Nucleic Acids Res. 19:3055-3060.)
Additionally, one may use PCR, nested primers, and PROMOTERFINDER libraries (Clontech, Palo
Alto CA) to walk genomic DNA. This procedure avoids the need to screen libraries and is useful in
finding intron/exon junctions. For all PCR-based methods, primers may be designed using
commercially available software, such as OLIGO 4.06 primer analysis software (National Biosciences,
Plymouth MN) or another appropriate program, to be about 22 to 30 nucleotides in length, to have a
GC content of about 50% or more, and to anneal to the template at temperatures of about 68°C to
72°C.
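
The three primer criteria stated above (length, GC content, annealing temperature) can be screened with a few lines of Python. This sketch is not a substitute for dedicated primer analysis software. It rests on two assumptions that are not part of the text: the melting temperature is estimated with the simple empirical formula Tm = 64.9 + 41 x (G + C - 16.4) / N, and that estimate is used as a stand-in for the annealing temperature. The function names are hypothetical.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of bases in the primer that are G or C."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def estimate_tm(primer: str) -> float:
    """Rough melting temperature in degrees C (assumed basic formula)."""
    primer = primer.upper()
    gc_count = primer.count("G") + primer.count("C")
    return 64.9 + 41.0 * (gc_count - 16.4) / len(primer)


def primer_meets_criteria(primer: str) -> bool:
    """Check length 22-30 nt, GC content >= 50%, estimated Tm 68-72 C."""
    return (
        22 <= len(primer) <= 30
        and gc_fraction(primer) >= 0.5
        and 68.0 <= estimate_tm(primer) <= 72.0
    )
```

Under this estimate a 30-nucleotide primer with 20 G or C bases has a Tm near 69.8°C and passes all three checks, whereas a 22-nucleotide primer made only of A and T fails on GC content.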

When screening for full length cDNAs, it is preferable to use libraries that have been
size-selected to include larger cDNAs. In addition, random-primed libraries, which often include
sequences containing the 5' regions of genes, are preferable for situations in which an oligo d(T) library
does not yield a full-length cDNA. Genomic libraries may be useful for extension of sequence into 5'
non-transcribed regulatory regions.

Capillary electrophoresis systems which are commercially available may be used to analyze the
size or confirm the nucleotide sequence of sequencing or PCR products. In particular, capillary
sequencing may employ flowable polymers for electrophoretic separation, four different nucleotide-specific,
laser-stimulated fluorescent dyes, and a charge coupled device camera for detection of the
emitted wavelengths. Output/light intensity may be converted to electrical signal using appropriate
software (e.g., GENOTYPER and SEQUENCE NAVIGATOR, Applied Biosystems), and the entire
process from loading of samples to computer analysis and electronic data display may be computer
controlled. Capillary electrophoresis is especially preferable for sequencing small DNA fragments
which may be present in limited amounts in a particular sample.

In another embodiment of the invention, polynucleotide sequences or fragments thereof which
encode PRTS may be cloned in recombinant DNA molecules that direct expression of PRTS, or
fragments or functional equivalents thereof, in appropriate host cells. Due to the inherent degeneracy of
the genetic code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be produced and used to express PRTS.

The nucleotide sequences of the present invention can be engineered using methods generally
known in the art in order to alter PRTS-encoding sequences for a variety of purposes including, but not
limited to, modification of the cloning, processing, and/or expression of the gene product. DNA
shuffling by random fragmentation and PCR reassembly of gene fragments and synthetic
oligonucleotides may be used to engineer the nucleotide sequences. For example, oligonucleotide-mediated
site-directed mutagenesis may be used to introduce mutations that create new restriction sites,
alter glycosylation patterns, change codon preference, produce splice variants, and so forth.

The nucleotides of the present invention may be subjected to DNA shuffling techniques such
as MOLECULARBREEDING (Maxygen Inc., Santa Clara CA; described in U.S. Patent Number
5,837,458; Chang, C.-C. et al. (1999) Nat. Biotechnol. 17:793-797; Christians, F.C. et al. (1999) Nat.
Biotechnol. 17:259-264; and Crameri, A. et al. (1996) Nat. Biotechnol. 14:315-319) to alter or
improve the biological properties of PRTS, such as its biological or enzymatic activity or its ability to
bind to other molecules or compounds. DNA shuffling is a process by which a library of gene
variants is produced using PCR-mediated recombination of gene fragments. The library is then
subjected to selection or screening procedures that identify those gene variants with the desired
properties. These preferred variants may then be pooled and further subjected to recursive rounds of
DNA shuffling and selection/screening. Thus, genetic diversity is created through "artificial"
breeding and rapid molecular evolution. For example, fragments of a single gene containing random
point mutations may be recombined, screened, and then reshuffled until the desired properties are
optimized. Alternatively, fragments of a given gene may be recombined with fragments of
homologous genes in the same gene family, either from the same or different species, thereby
maximizing the genetic diversity of multiple naturally occurring genes in a directed and controllable
manner.

In another embodiment, sequences encoding PRTS may be synthesized, in whole or in part,
using chemical methods well known in the art. (See, e.g., Caruthers, M.H. et al. (1980) Nucleic Acids
Symp. Ser. 7:215-223; and Horn, T. et al. (1980) Nucleic Acids Symp. Ser. 7:225-232.) Alternatively,
PRTS itself or a fragment thereof may be synthesized using chemical methods. For example, peptide
synthesis can be performed using various solution-phase or solid-phase techniques. (See, e.g.,
Creighton, T. (1984) Proteins, Structures and Molecular Properties, WH Freeman, New York NY, pp.
55-60; and Roberge, J.Y. et al. (1995) Science 269:202-204.) Automated synthesis may be achieved
using the ABI 431A peptide synthesizer (Applied Biosystems). Additionally, the amino acid sequence
of PRTS, or any part thereof, may be altered during direct synthesis and/or combined with sequences
from other proteins, or any part thereof, to produce a variant polypeptide or a polypeptide having a
sequence of a naturally occurring polypeptide.

The peptide may be substantially purified by preparative high performance liquid
chromatography. (See, e.g., Chiez, R.M. and F.Z. Regnier (1990) Methods Enzymol. 182:392-421.)
The composition of the synthetic peptides may be confirmed by amino acid analysis or by sequencing.
(See, e.g., Creighton, supra, pp. 28-53.)

In order to express a biologically active PRTS, the nucleotide sequences encoding PRTS or
derivatives thereof may be inserted into an appropriate expression vector, i.e., a vector which contains
the necessary elements for transcriptional and translational control of the inserted coding sequence in a
suitable host. These elements include regulatory sequences, such as enhancers, constitutive and
inducible promoters, and 5' and 3' untranslated regions in the vector and in polynucleotide sequences
encoding PRTS. Such elements may vary in their strength and specificity. Specific initiation signals
may also be used to achieve more efficient translation of sequences encoding PRTS. Such signals
include the ATG initiation codon and adjacent sequences, e.g. the Kozak sequence. In cases where
sequences encoding PRTS and its initiation codon and upstream regulatory sequences are inserted into
the appropriate expression vector, no additional transcriptional or translational control signals may be
needed. However, in cases where only coding sequence, or a fragment thereof, is inserted, exogenous
translational control signals including an in-frame ATG initiation codon should be provided by the
vector. Exogenous translational elements and initiation codons may be of various origins, both natural
and synthetic. The efficiency of expression may be enhanced by the inclusion of enhancers appropriate
for the particular host cell system used. (See, e.g., Scharf, D. et al. (1994) Results Probl. Cell Differ.
20:125-162.)

Methods which are well known to those skilled in the art may be used to construct expression
vectors containing sequences encoding PRTS and appropriate transcriptional and translational control
elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in
vivo genetic recombination. (See, e.g., Sambrook, J. et al. (1989) Molecular Cloning, A Laboratory
Manual, Cold Spring Harbor Press, Plainview NY, ch. 4, 8, and 16-17; Ausubel, F.M. et al. (1995)
Current Protocols in Molecular Biology, John Wiley & Sons, New York NY, ch. 9, 13, and 16.)

A variety of expression vector/host systems may be utilized to contain and express sequences
encoding PRTS. These include, but are not limited to, microorganisms such as bacteria transformed
with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with
yeast expression vectors; insect cell systems infected with viral expression vectors (e.g., baculovirus);
plant cell systems transformed with viral expression vectors (e.g., cauliflower mosaic virus, CaMV, or
tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or
animal cell systems. (See, e.g., Sambrook, supra; Ausubel, supra; Van Heeke, G. and S.M. Schuster
(1989) J. Biol. Chem. 264:5503-5509; Engelhard, E.K. et al. (1994) Proc. Natl. Acad. Sci. USA
91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945; Takamatsu, N. (1987) EMBO
J. 6:307-311; The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New
York NY, pp. 191-196; Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659; and
Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355.) Expression vectors derived from retroviruses,
adenoviruses, or herpes or vaccinia viruses, or from various bacterial plasmids, may be used for
delivery of nucleotide sequences to the targeted organ, tissue, or cell population. (See, e.g., Di
Nicola, M. et al. (1998) Cancer Gen. Ther. 5(6):350-356; Yu, M. et al. (1993) Proc. Natl. Acad. Sci.
USA 90(13):6340-6344; Buller, R.M. et al. (1985) Nature 317(6040):813-815; McGregor, D.P. et al.
(1994) Mol. Immunol. 31(3):219-226; and Verma, I.M. and N. Somia (1997) Nature 389:239-242.)
The invention is not limited by the host cell employed.

In bacterial systems, a number of cloning and expression vectors may be selected depending
upon the use intended for polynucleotide sequences encoding PRTS. For example, routine cloning,
subcloning, and propagation of polynucleotide sequences encoding PRTS can be achieved using a
multifunctional E. coli vector such as PBLUESCRIPT (Stratagene, La Jolla CA) or PSPORT1 plasmid
(Life Technologies). Ligation of sequences encoding PRTS into the vector's multiple cloning site
disrupts the lacZ gene, allowing a colorimetric screening procedure for identification of transformed
bacteria containing recombinant molecules. In addition, these vectors may be useful for in vitro
transcription, dideoxy sequencing, single strand rescue with helper phage, and creation of nested
deletions in the cloned sequence. (See, e.g., Van Heeke, G. and S.M. Schuster (1989) J. Biol. Chem.
264:5503-5509.) When large quantities of PRTS are needed, e.g. for the production of antibodies,
vectors which direct high level expression of PRTS may be used. For example, vectors containing the
strong, inducible SP6 or T7 bacteriophage promoter may be used.

Yeast expression systems may be used for production of PRTS. A number of vectors
containing constitutive or inducible promoters, such as alpha factor, alcohol oxidase, and PGH
promoters, may be used in the yeast Saccharomyces cerevisiae or Pichia pastoris. In addition, such
vectors direct either the secretion or intracellular retention of expressed proteins and enable integration
of foreign sequences into the host genome for stable propagation. (See, e.g., Ausubel, 1995, supra;
Bitter, G.A. et al. (1987) Methods Enzymol. 153:516-544; and Scorer, C.A. et al. (1994)
Bio/Technology 12:181-184.)

Plant systems may also be used for expression of PRTS. Transcription of sequences encoding
PRTS may be driven by viral promoters, e.g., the 35S and 19S promoters of CaMV used alone or in
combination with the omega leader sequence from TMV (Takamatsu, N. (1987) EMBO J. 6:307-311).
Alternatively, plant promoters such as the small subunit of RUBISCO or heat shock promoters may be
used. (See, e.g., Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie, R. et al. (1984) Science
224:838-843; and Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105.) These constructs can
be introduced into plant cells by direct DNA transformation or pathogen-mediated transfection. (See,
e.g., The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New York NY, pp.
191-196.)

In mammalian cells, a number of viral-based expression systems may be utilized. In cases
where an adenovirus is used as an expression vector, sequences encoding PRTS may be ligated into an
adenovirus transcription/translation complex consisting of the late promoter and tripartite leader
sequence. Insertion in a non-essential E1 or E3 region of the viral genome may be used to obtain
infective virus which expresses PRTS in host cells. (See, e.g., Logan, J. and T. Shenk (1984) Proc.
Natl. Acad. Sci. USA 81:3655-3659.) In addition, transcription enhancers, such as the Rous sarcoma
virus (RSV) enhancer, may be used to increase expression in mammalian host cells. SV40 or EBV-based
vectors may also be used for high-level protein expression.

Human artificial chromosomes (HACs) may also be employed to deliver larger fragments of
DNA than can be contained in and expressed from a plasmid. HACs of about 6 kb to 10 Mb are
constructed and delivered via conventional delivery methods (liposomes, polycationic amino polymers,
or vesicles) for therapeutic purposes. (See, e.g., Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355.)

For long term production of recombinant proteins in mammalian systems, stable expression of
PRTS in cell lines is preferred. For example, sequences encoding PRTS can be transformed into cell
lines using expression vectors which may contain viral origins of replication and/or endogenous
expression elements and a selectable marker gene on the same or on a separate vector. Following the
introduction of the vector, cells may be allowed to grow for about 1 to 2 days in enriched media before
being switched to selective media. The purpose of the selectable marker is to confer resistance to a
selective agent, and its presence allows growth and recovery of cells which successfully express the
introduced sequences. Resistant clones of stably transformed cells may be propagated using tissue
culture techniques appropriate to the cell type.

Any number of selection systems may be used to recover transformed cell lines. These include,
but are not limited to, the herpes simplex virus thymidine kinase and adenine phosphoribosyltransferase
genes, for use in tk⁻ and apr⁻ cells, respectively. (See, e.g., Wigler, M. et al. (1977) Cell 11:223-232;
Lowy, I. et al. (1980) Cell 22:817-823.) Also, antimetabolite, antibiotic, or herbicide resistance can be
used as the basis for selection. For example, dhfr confers resistance to methotrexate; neo confers
resistance to the aminoglycosides neomycin and G-418; and als and pat confer resistance to
chlorsulfuron and phosphinotricin acetyltransferase, respectively. (See, e.g., Wigler, M. et al. (1980)
Proc. Natl. Acad. Sci. USA 77:3567-3570; Colbere-Garapin, F. et al. (1981) J. Mol. Biol. 150:1-14.)
Additional selectable genes have been described, e.g., trpB and hisD, which alter cellular requirements
for metabolites. (See, e.g., Hartman, S.C. and R.C. Mulligan (1988) Proc. Natl. Acad. Sci. USA
85:8047-8051.) Visible markers, e.g., anthocyanins, green fluorescent proteins (GFP; Clontech), β
glucuronidase and its substrate β-glucuronide, or luciferase and its substrate luciferin may be used.
These markers can be used not only to identify transformants, but also to quantify the amount of
transient or stable protein expression attributable to a specific vector system. (See, e.g., Rhodes, C.A.
(1995) Methods Mol. Biol. 55:121-131.)

Although the presence/absence of marker gene expression suggests that the gene of interest is
also present, the presence and expression of the gene may need to be confirmed. For example, if the
sequence encoding PRTS is inserted within a marker gene sequence, transformed cells containing
sequences encoding PRTS can be identified by the absence of marker gene function. Alternatively, a
marker gene can be placed in tandem with a sequence encoding PRTS under the control of a single
promoter. Expression of the marker gene in response to induction or selection usually indicates
expression of the tandem gene as well.

In general, host cells that contain the nucleic acid sequence encoding PRTS and that express
PRTS may be identified by a variety of procedures known to those of skill in the art. These procedures
include, but are not limited to, DNA-DNA or DNA-RNA hybridizations, PCR amplification, and
protein bioassay or immunoassay techniques which include membrane, solution, or chip based
technologies for the detection and/or quantification of nucleic acid or protein sequences.

Immunological methods for detecting and measuring the expression of PRTS using either
specific polyclonal or monoclonal antibodies are known in the art. Examples of such techniques include
enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays (RIAs), and fluorescence
activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing monoclonal
antibodies reactive to two non-interfering epitopes on PRTS is preferred, but a competitive binding
assay may be employed. These and other assays are well known in the art. (See, e.g., Hampton, R. et
al. (1990) Serological Methods, a Laboratory Manual, APS Press, St. Paul MN, Sect. IV; Coligan, J.E.
et al. (1997) Current Protocols in Immunology, Greene Pub. Associates and Wiley-Interscience, New
York NY; and Pound, J.D. (1998) Immunochemical Protocols, Humana Press, Totowa NJ.)

A wide variety of labels and conjugation techniques are known by those skilled in the art and
may be used in various nucleic acid and amino acid assays. Means for producing labeled hybridization
or PCR probes for detecting sequences related to polynucleotides encoding PRTS include oligolabeling,
nick translation, end-labeling, or PCR amplification using a labeled nucleotide. Alternatively, the
sequences encoding PRTS, or any fragments thereof, may be cloned into a vector for the production of
an mRNA probe. Such vectors are known in the art, are commercially available, and may be used to
synthesize RNA probes in vitro by addition of an appropriate RNA polymerase such as T7, T3, or SP6
and labeled nucleotides. These procedures may be conducted using a variety of commercially available
kits, such as those provided by Amersham Pharmacia Biotech, Promega (Madison WI), and US
Biochemical. Suitable reporter molecules or labels which may be used for ease of detection include
radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents, as well as substrates,
cofactors, inhibitors, magnetic particles, and the like.

Host cells transformed with nucleotide sequences encoding PRTS may be cultured under
conditions suitable for the expression and recovery of the protein from cell culture. The protein
produced by a transformed cell may be secreted or retained intracellularly depending on the sequence
and/or the vector used. As will be understood by those of skill in the art, expression vectors containing
polynucleotides which encode PRTS may be designed to contain signal sequences which direct secretion
of PRTS through a prokaryotic or eukaryotic cell membrane.

In addition, a host cell strain may be chosen for its ability to modulate expression of the
inserted sequences or to process the expressed protein in the desired fashion. Such modifications of the
polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation,
lipidation, and acylation. Post-translational processing which cleaves a "prepro" or "pro" form of the
protein may also be used to specify protein targeting, folding, and/or activity. Different host cells
which have specific cellular machinery and characteristic mechanisms for post-translational activities
(e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the American Type Culture
Collection (ATCC, Manassas VA) and may be chosen to ensure the correct modification and processing
of the foreign protein.

In another embodiment of the invention, natural, modified, or recombinant nucleic acid
sequences encoding PRTS may be ligated to a heterologous sequence resulting in translation of a fusion
protein in any of the aforementioned host systems. For example, a chimeric PRTS protein containing a
heterologous moiety that can be recognized by a commercially available antibody may facilitate the
screening of peptide libraries for inhibitors of PRTS activity. Heterologous protein and peptide
moieties may also facilitate purification of fusion proteins using commercially available affinity
matrices. Such moieties include, but are not limited to, glutathione S-transferase (GST), maltose
binding protein (MBP), thioredoxin (Trx), calmodulin binding peptide (CBP), 6-His, FLAG, c-myc, and
hemagglutinin (HA). GST, MBP, Trx, CBP, and 6-His enable purification of their cognate fusion
proteins on immobilized glutathione, maltose, phenylarsine oxide, calmodulin, and metal-chelate resins,
respectively. FLAG, c-myc, and hemagglutinin (HA) enable immunoaffinity purification of fusion
proteins using commercially available monoclonal and polyclonal antibodies that specifically recognize
these epitope tags. A fusion protein may also be engineered to contain a proteolytic cleavage site
located between the PRTS encoding sequence and the heterologous protein sequence, so that PRTS may
be cleaved away from the heterologous moiety following purification. Methods for fusion protein
expression and purification are discussed in Ausubel (1995, supra, ch. 10). A variety of commercially
available kits may also be used to facilitate expression and purification of fusion proteins.
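
The "respectively" pairing of fusion moieties with purification matrices can be laid out as a lookup table. The Python sketch below simply restates the pairs given in the text; the function and variable names are hypothetical.

```python
# Fusion moiety -> matrix on which its fusion protein is purified,
# as paired in the text.
AFFINITY_MATRICES = {
    "GST": "immobilized glutathione",
    "MBP": "maltose",
    "Trx": "phenylarsine oxide",
    "CBP": "calmodulin",
    "6-His": "metal-chelate resin",
}

# Epitope tags purified with tag-specific antibodies.
EPITOPE_TAGS = {"FLAG", "c-myc", "HA"}


def purification_route(tag: str) -> str:
    """Describe how a fusion protein carrying `tag` may be purified."""
    if tag in AFFINITY_MATRICES:
        return f"affinity purification on {AFFINITY_MATRICES[tag]}"
    if tag in EPITOPE_TAGS:
        return "immunoaffinity purification with a tag-specific antibody"
    raise KeyError(f"no purification route listed for tag {tag!r}")
```

For example, `purification_route("Trx")` returns "affinity purification on phenylarsine oxide".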

In a further embodiment of the invention, synthesis of radiolabeled PRTS may be achieved in
vitro using the TNT rabbit reticulocyte lysate or wheat germ extract system. These systems
couple transcription and translation of protein-coding sequences operably associated with the T7, T3, or
SP6 promoters. Translation takes place in the presence of a radiolabeled amino acid precursor, for
example, ³⁵S-methionine.

PRTS of the present invention or fragments tiiereof may be used to screen for compounds that 
specifically bind to PRTS. At least one and up to a plurality of test compounds may be screened for 
specific binding to PRTS. Examples of test compounds include antibodies, oUgonucleotides, proteins 
(e.g., rec^tors), or small molecules. 
In one embodiment, the compound thus identified is closely related to the natural ligand of
PRTS, e.g., a ligand or fragment thereof, a natural substrate, a structural or functional mimetic, or a
natural binding partner. (See, e.g., Coligan, J.E. et al. (1991) Current Protocols in Immunology 1(2):
Chapter 5.) Similarly, the compound can be closely related to the natural receptor to which PRTS
binds, or to at least a fragment of the receptor, e.g., the ligand binding site. In either case, the
compound can be rationally designed using known techniques. In one embodiment, screening for
these compounds involves producing appropriate cells which express PRTS, either as a secreted
protein or on the cell membrane. Preferred cells include cells from mammals, yeast, Drosophila, or E.
coli. Cells expressing PRTS or cell membrane fractions which contain PRTS are then contacted with
a test compound and binding, stimulation, or inhibition of activity of either PRTS or the compound is
analyzed.

An assay may simply test binding of a test compound to the polypeptide, wherein binding is
detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label. For example,
the assay may comprise the steps of combining at least one test compound with PRTS, either in
solution or affixed to a solid support, and detecting the binding of PRTS to the compound.
Alternatively, the assay may detect or measure binding of a test compound in the presence of a
labeled competitor. Additionally, the assay may be carried out using cell-free preparations, chemical
libraries, or natural product mixtures, and the test compound(s) may be free in solution or affixed to a
solid support.

PRTS of the present invention or fragments thereof may be used to screen for compounds that
modulate the activity of PRTS. Such compounds may include agonists, antagonists, or partial or
inverse agonists. In one embodiment, an assay is performed under conditions permissive for PRTS
activity, wherein PRTS is combined with at least one test compound, and the activity of PRTS in the
presence of a test compound is compared with the activity of PRTS in the absence of the test
compound. A change in the activity of PRTS in the presence of the test compound is indicative of a
compound that modulates the activity of PRTS. Alternatively, a test compound is combined with an in
vitro or cell-free system comprising PRTS under conditions suitable for PRTS activity, and the assay is
performed. In either of these assays, a test compound which modulates the activity of PRTS may do so
indirectly and need not come in direct contact with PRTS. At least one and up to a plurality
of test compounds may be screened.
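
The comparison described above, activity in the presence of a test compound against activity in its absence, can be sketched numerically. The following is a minimal illustration only: the function names, the replicate readings, and the 20% decision threshold are assumptions made for the example and are not taken from this specification.

```python
from statistics import mean


def percent_change_in_activity(with_compound, without_compound):
    """Percent change in mean activity relative to the no-compound control.

    Both arguments are lists of replicate activity readings in the same
    (arbitrary) units. The names are hypothetical.
    """
    control = mean(without_compound)
    if control == 0:
        raise ValueError("control activity must be non-zero")
    return 100.0 * (mean(with_compound) - control) / control


def classify_modulator(with_compound, without_compound, threshold_pct=20.0):
    """Label a compound by the direction of the activity change.

    threshold_pct is an assumed cutoff for calling a change meaningful;
    a real assay would choose it from the assay's own variability.
    """
    change = percent_change_in_activity(with_compound, without_compound)
    if change >= threshold_pct:
        return "activator"
    if change <= -threshold_pct:
        return "inhibitor"
    return "no detectable modulation"


if __name__ == "__main__":
    # Hypothetical replicate readings, with and without the test compound.
    print(classify_modulator([50, 52, 48], [100, 98, 102]))
```

A screen of a plurality of compounds would simply apply the same comparison to each compound against a shared control.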

In another embodiment, polynucleotides encoding PRTS or their mammalian homologs may be
"knocked out" in an animal model system using homologous recombination in embryonic stem (ES)
cells. Such techniques are well known in the art and are useful for the generation of animal models of
human disease. (See, e.g., U.S. Patent Number 5,175,383 and U.S. Patent Number 5,767,337.) For
example, mouse ES cells, such as the mouse 129/SvJ cell line, are derived from the early mouse embryo
and grown in culture. The ES cells are transformed with a vector containing the gene of interest
disrupted by a marker gene, e.g., the neomycin phosphotransferase gene (neo; Capecchi, M.R. (1989)
Science 244:1288-1292). The vector integrates into the corresponding region of the host genome by
homologous recombination. Alternatively, homologous recombination takes place using the Cre-loxP
system to knockout a gene of interest in a tissue- or developmental stage-specific manner (Marth, J.D.
(1996) Clin. Invest. 97:1999-2002; Wagner, K.U. et al. (1997) Nucleic Acids Res. 25:4323-4330).
Transformed ES cells are identified and microinjected into mouse cell blastocysts such as those from
the C57BL/6 mouse strain. The blastocysts are surgically transferred to pseudopregnant dams, and the
resulting chimeric progeny are genotyped and

Transgenic animals thus generated may be tested with potential therapeutic or toxic agents.

Polynucleotides encoding PRTS may also be manipulated in vitro in ES cells derived from
human blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell
lineages including endoderm, mesoderm, and ectodermal cell types. These cell lineages differentiate
into, for example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson, J.A. et al. (1998)
Science 282:1145-1147).

Polynucleotides encoding PRTS can also be used to create "knockin" humanized animals (pigs)
or transgenic animals (mice or rats) to model human disease. With knockin technology, a region of a
polynucleotide encoding PRTS is injected into animal ES cells, and the injected sequence integrates into
the animal cell genome. Transformed cells are injected into blastulae, and the blastulae are implanted
as described above. Transgenic progeny or inbred lines are studied and treated with potential
pharmaceutical agents to obtain information on treatment of a human disease. Alternatively, a mammal
inbred to overexpress PRTS, e.g., by secreting PRTS in its milk, may also serve as a convenient source
of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).

THERAPEUTICS 

PRTS are useful for hydrolyzing peptide bonds. Chemical and structural similarity, e.g., in
the context of sequences and motifs, exists between regions of PRTS and proteases. In addition, the
expression of PRTS is closely associated with hemic, neurological, reproductive, endocrine,
urogenital, diseased, teratocarcinoma, and tumorous tissues. Therefore, PRTS appears to play a role
in gastrointestinal, cardiovascular, autoimmune/inflammatory, cell proliferative, developmental,
epithelial, neurological, and reproductive disorders. In the treatment of disorders associated with
increased PRTS expression or activity, it is desirable to decrease the expression or activity of PRTS.
In the treatment of disorders associated with decreased PRTS expression or activity, it is desirable to
increase the expression or activity of PRTS.

Therefore, in one embodiment, PRTS or a fragment or derivative thereof may be administered
to a subject to treat or prevent a disorder associated with decreased expression or activity of PRTS.
Examples of such disorders include, but are not limited to, a gastrointestinal disorder, such as
dysphagia, peptic esophagitis, esophageal spasm, esophageal stricture, esophageal carcinoma,
dyspepsia, indigestion, gastritis, gastric carcinoma, anorexia, nausea, emesis, gastroparesis, antral or
pyloric edema, abdominal angina, pyrosis, gastroenteritis, intestinal obstruction, infections of the
intestinal tract, peptic ulcer, cholelithiasis, cholecystitis, cholestasis, pancreatitis, pancreatic carcinoma,
biliary tract disease, hepatitis, hyperbilirubinemia, cirrhosis, passive congestion of the liver, hepatoma,
infectious colitis, ulcerative colitis, ulcerative proctitis, Crohn's disease, Whipple's disease,
Mallory-Weiss syndrome, colonic carcinoma, colonic obstruction, irritable bowel syndrome, short bowel
syndrome, diarrhea, constipation, gastrointestinal hemorrhage, acquired immunodeficiency syndrome
(AIDS) enteropathy, jaundice, hepatic encephalopathy, hepatorenal syndrome, hepatic steatosis,
hemochromatosis, Wilson's disease, alpha-1-antitrypsin deficiency, Reye's syndrome, primary sclerosing
cholangitis, liver infarction, portal vein obstruction and thrombosis, centrilobular necrosis, peliosis
hepatis, hepatic vein thrombosis, veno-occlusive disease, preeclampsia, eclampsia, acute fatty liver of
pregnancy, intrahepatic cholestasis of pregnancy, and hepatic tumors including nodular hyperplasias,
adenomas, and carcinomas; a cardiovascular disorder, such as arteriovenous fistula, atherosclerosis,
hypertension, vasculitis, Raynaud's disease, aneurysms, arterial dissections, varicose veins,
thrombophlebitis and phlebothrombosis, vascular tumors, and complications of thrombolysis, balloon
angioplasty, vascular replacement, and coronary artery bypass graft surgery, congestive heart failure,
ischemic heart disease, angina pectoris, myocardial infarction, hypertensive heart disease, degenerative
valvular heart disease, calcific aortic valve stenosis, congenitally bicuspid aortic valve, mitral annular
calcification, mitral valve prolapse, rheumatic fever and rheumatic heart disease, infective endocarditis,
nonbacterial thrombotic endocarditis, endocarditis of systemic lupus erythematosus, carcinoid heart
disease, cardiomyopathy, myocarditis, pericarditis, neoplastic heart disease, congenital heart disease,
and complications of cardiac transplantation; an autoimmune/inflammatory disorder, such as acquired
immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory distress syndrome, allergies,
ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis, atherosclerotic plaque rupture,
autoimmune hemolytic anemia, autoimmune thyroiditis, autoimmune
polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis, cholecystitis, contact
dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema,
episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic
gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis,
hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, myocardial or
pericardial inflammation, osteoarthritis, degradation of articular cartilage, osteoporosis, pancreatitis,
polymyositis, psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjogren's syndrome,
systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis, thrombocytopenic purpura,
ulcerative colitis, uveitis, Werner syndrome, complications of cancer, hemodialysis, and extracorporeal
circulation, viral, bacterial, fungal, parasitic, protozoal, and helminthic infections, and trauma; a cell
proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis,
hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal
hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and cancers including
adenocarcinoma, leukemia,
lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal
gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract,
heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin,
spleen, testis, thymus, thyroid, and uterus; a developmental disorder, such as renal tubular acidosis,
anemia, Cushing's syndrome, achondroplastic dwarfism, Duchenne and Becker muscular dystrophy,
bone resorption, epilepsy, gonadal dysgenesis, WAGR syndrome (Wilms' tumor, aniridia, genitourinary
abnormalities, and mental retardation), Smith-Magenis syndrome, myelodysplastic syndrome,
hereditary mucoepithelial dysplasia, hereditary keratodermas, hereditary neuropathies such as
Charcot-Marie-Tooth disease and neurofibromatosis, hypothyroidism, hydrocephalus, seizure disorders
such as Sydenham's chorea and cerebral palsy, spina bifida, anencephaly, craniorachischisis, congenital
glaucoma, cataract, age-related macular degeneration, and sensorineural hearing loss; an epithelial
disorder, such as dyshidrotic eczema, allergic contact dermatitis, keratosis pilaris, melasma, vitiligo,
actinic keratosis, basal cell carcinoma, squamous cell carcinoma, seborrheic keratosis, folliculitis,
herpes simplex, herpes zoster, varicella, candidiasis, dermatophytosis, scabies, insect bites, cherry
angioma, keloid, dermatofibroma, acrochordons, urticaria, transient acantholytic dermatosis, xerosis,
eczema, atopic dermatitis, contact dermatitis, hand eczema, nummular eczema, lichen simplex
chronicus, asteatotic eczema, stasis dermatitis and stasis ulceration, seborrheic dermatitis, psoriasis,
lichen planus, pityriasis rosea, impetigo, ecthyma, dermatophytosis, tinea versicolor, warts, acne
vulgaris, acne rosacea, pemphigus vulgaris, pemphigus foliaceus, paraneoplastic pemphigus, bullous
pemphigoid, herpes gestationis, dermatitis herpetiformis, linear IgA disease, epidermolysis bullosa
acquisita, dermatomyositis, lupus erythematosus, scleroderma and morphea, erythroderma, alopecia,
figurate skin lesions, telangiectasias, hypopigmentation, hyperpigmentation, vesicles/bullae, exanthems,
cutaneous drug reactions, papulonodular skin lesions, chronic non-healing wounds, photosensitivity
diseases, epidermolysis bullosa simplex, epidermolytic hyperkeratosis, epidermolytic and
nonepidermolytic palmoplantar keratoderma, ichthyosis bullosa of Siemens, ichthyosis exfoliativa,
keratosis palmaris et plantaris, keratosis palmoplantaris, palmoplantar keratoderma, keratosis punctata,
Meesmann's corneal dystrophy, pachyonychia congenita, white sponge nevus, steatocystoma multiplex,
epidermal nevi/epidermolytic hyperkeratosis type, monilethrix, trichothiodystrophy, chronic
hepatitis/cryptogenic cirrhosis, and colorectal hyperplasia; a neurological disorder, such as epilepsy,
ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease,
Huntington's disease, dementia, Parkinson's disease and other extrapyramidal disorders, amyotrophic
lateral sclerosis and other motor neuron disorders, progressive neural muscular atrophy, retinitis
pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases, bacterial and viral
meningitis, brain abscess, subdural empyema, epidural abscess, suppurative intracranial
thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion diseases including
kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome, fatal familial
insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis, tuberous
sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental retardation
and other developmental disorders of the central nervous system including Down syndrome, cerebral
palsy, neuroskeletal disorders, autonomic nervous system disorders, cranial nerve disorders, spinal cord
diseases, muscular dystrophy and other neuromuscular disorders, peripheral nervous system disorders,
dermatomyositis and polymyositis, inherited, metabolic, endocrine, and toxic myopathies, myasthenia
gravis, periodic paralysis, mental disorders including mood, anxiety, and schizophrenic disorders,
seasonal affective disorder (SAD), akathisia, amnesia, catatonia, diabetic neuropathy, tardive
dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia, Tourette's disorder, progressive
supranuclear palsy, corticobasal degeneration, and familial frontotemporal dementia; and a reproductive
disorder, such as infertility, including tubal disease, ovulatory defects, and endometriosis, a disorder of
prolactin production, a disruption of the estrous cycle, a disruption of the menstrual cycle, polycystic
ovary syndrome, ovarian hyperstimulation syndrome, an endometrial or ovarian tumor, a uterine
fibroid, autoimmune disorders, an ectopic pregnancy, and teratogenesis; cancer of the breast, fibrocystic
breast disease, and galactorrhea; a disruption of spermatogenesis, abnormal sperm physiology, cancer
of the testis, cancer of the prostate, benign prostatic hyperplasia, prostatitis, Peyronie's disease,
impotence, carcinoma of the male breast, and gynecomastia.

In another embodiment, a vector capable of expressing PRTS or a fragment or derivative
thereof may be administered to a subject to treat or prevent a disorder associated with decreased
expression or activity of PRTS including, but not limited to, those described above.

In a further embodiment, a composition comprising a substantially purified PRTS in
conjunction with a suitable pharmaceutical carrier may be administered to a subject to treat or prevent a
disorder associated with decreased expression or activity of PRTS including, but not limited to, those
provided above.

In still another embodiment, an agonist which modulates the activity of PRTS may be
administered to a subject to treat or prevent a disorder associated with decreased expression or activity
of PRTS including, but not limited to, those listed above.

In a further embodiment, an antagonist of PRTS may be administered to a subject to treat or
prevent a disorder associated with increased expression or activity of PRTS. Examples of such
disorders include, but are not limited to, those gastrointestinal, cardiovascular,
autoimmune/inflammatory, cell proliferative, developmental, epithelial, neurological, and reproductive
disorders described above. In one aspect, an antibody which specifically binds PRTS may be used
directly as an antagonist or indirectly as a targeting or delivery mechanism for bringing a
pharmaceutical agent to cells or tissues which express PRTS.

In an additional embodiment, a vector expressing the complement of the polynucleotide
encoding PRTS may be administered to a subject to treat or prevent a disorder associated with
increased expression or activity of PRTS including, but not limited to, those described above.

In other embodiments, any of the proteins, antagonists, antibodies, agonists, complementary
sequences, or vectors of the invention may be administered in combination with other appropriate
therapeutic agents. Selection of the appropriate agents for use in combination therapy may be made by
one of ordinary skill in the art, according to conventional pharmaceutical principles. The combination
of therapeutic agents may act synergistically to effect the treatment or prevention of the various
disorders described above. Using this approach, one may be able to achieve therapeutic efficacy with
lower dosages of each agent, thus reducing the potential for adverse side effects.

An antagonist of PRTS may be produced using methods which are generally known in the art.
In particular, purified PRTS may be used to produce antibodies or to screen libraries of pharmaceutical
agents to identify those which specifically bind PRTS. Antibodies to PRTS may also be generated
using methods that are well known in the art. Such antibodies may include, but are not limited to,
polyclonal, monoclonal, chimeric, and single chain antibodies, Fab fragments, and fragments produced
by a Fab expression library. Neutralizing antibodies (i.e., those which inhibit dimer formation) are
generally preferred for therapeutic use.

For the production of antibodies, various hosts including goats, rabbits, rats, mice, humans,
and others may be immunized by injection with PRTS or with any fragment or oligopeptide thereof
which has immunogenic properties. Depending on the host species, various adjuvants may be used to
increase immunological response. Such adjuvants include, but are not limited to, Freund's, mineral gels
such as aluminum hydroxide, and surface active substances such as lysolecithin, pluronic polyols,
polyanions, peptides, oil emulsions, KLH, and dinitrophenol. Among adjuvants used in humans, BCG
(bacilli Calmette-Guerin) and Corynebacterium parvum are especially preferable.

It is preferred that the oligopeptides, peptides, or fragments used to induce antibodies to PRTS
have an amino acid sequence consisting of at least about 5 amino acids, and generally will consist of at
least about 10 amino acids. It is also preferable that these oligopeptides, peptides, or fragments are
identical to a portion of the amino acid sequence of the natural protein. Short stretches of PRTS amino
acids may be fused with those of another protein, such as KLH, and antibodies to the chimeric molecule
may be produced.

Monoclonal antibodies to PRTS may be prepared using any technique which provides for the
production of antibody molecules by continuous cell lines in culture. These include, but are not limited
to, the hybridoma technique, the human B-cell hybridoma technique, and the EBV-hybridoma
technique. (See, e.g., Kohler, G. et al. (1975) Nature 256:495-497; Kozbor, D. et al. (1985) J.
Immunol. Methods 81:31-42; Cote, R.J. et al. (1983) Proc. Natl. Acad. Sci. USA 80:2026-2030; and
Cole, S.P. et al. (1984) Mol. Cell Biol. 62:109-120.)

In addition, techniques developed for the production of "chimeric antibodies," such as the
splicing of mouse antibody genes to human antibody genes to obtain a molecule with appropriate
antigen specificity and biological activity, can be used. (See, e.g., Morrison, S.L. et al. (1984) Proc.
Natl. Acad. Sci. USA 81:6851-6855; Neuberger, M.S. et al. (1984) Nature 312:604-608; and Takeda,
S. et al. (1985) Nature 314:452-454.) Alternatively, techniques described for the production of single
chain antibodies may be adapted, using methods known in the art, to produce PRTS-specific single
chain antibodies. Antibodies with related specificity, but of distinct idiotypic composition, may be
generated by chain shuffling from random combinatorial immunoglobulin libraries. (See, e.g., Burton,
D.R. (1991) Proc. Natl. Acad. Sci. USA 88:10134-10137.)

Antibodies may also be produced by inducing in vivo production in the lymphocyte population
or by screening immunoglobulin libraries or panels of highly specific binding reagents as disclosed in
the literature. (See, e.g., Orlandi, R. et al. (1989) Proc. Natl. Acad. Sci. USA 86:3833-3837; Winter,
G. et al. (1991) Nature 349:293-299.)

Antibody fragments which contain specific binding sites for PRTS may also be generated. For
example, such fragments include, but are not limited to, F(ab')2 fragments produced by pepsin digestion
of the antibody molecule and Fab fragments generated by reducing the disulfide bridges of the F(ab')2
fragments. Alternatively, Fab expression libraries may be constructed to allow rapid and easy
identification of monoclonal Fab fragments with the desired specificity. (See, e.g., Huse, W.D. et al.
(1989) Science 246:1275-1281.)

Various immunoassays may be used for screening to identify antibodies having the desired
specificity. Numerous protocols for competitive binding or immunoradiometric assays using either
polyclonal or monoclonal antibodies with established specificities are well known in the art. Such
immunoassays typically involve the measurement of complex formation between PRTS and its specific
antibody. A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive to two
non-interfering PRTS epitopes is generally used, but a competitive binding assay may also be employed
(Pound, supra).

Various methods such as Scatchard analysis in conjunction with radioimmunoassay techniques
may be used to assess the affinity of antibodies for PRTS. Affinity is expressed as an association
constant, Ka, which is defined as the molar concentration of PRTS-antibody complex divided by the
molar concentrations of free antigen and free antibody under equilibrium conditions. The Ka determined
for a preparation of polyclonal antibodies, which are heterogeneous in their affinities for multiple PRTS
epitopes, represents the average affinity, or avidity, of the antibodies for PRTS. The Ka determined for
a preparation of monoclonal antibodies, which are monospecific for a particular PRTS epitope,
represents a true measure of affinity. High-affinity antibody preparations with Ka ranging from about
10^9 to 10^12 L/mole are preferred for use in immunoassays in which the PRTS-antibody complex must
withstand rigorous manipulations. Low-affinity antibody preparations with Ka ranging from about 10^6
to 10^7 L/mole are preferred for use in immunopurification and similar procedures which ultimately
require dissociation of PRTS, preferably in active form, from the antibody (Catty, D. (1988)
Antibodies, Volume I: A Practical Approach, IRL Press, Washington DC; Liddell, J.E. and A. Cryer
(1991) A Practical Guide to Monoclonal Antibodies, John Wiley & Sons, New York NY).
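
The verbal definition of the association constant above can be written compactly. The bracket notation for equilibrium molar concentrations, and the symbols B, F, and B_max in the Scatchard form, are notation assumed here for illustration and do not appear in the specification.

```latex
% Association constant: complex over the product of the free species,
% all as molar concentrations at equilibrium (units: L/mole).
K_a = \frac{[\mathrm{PRTS{\cdot}Ab}]}{[\mathrm{PRTS}]_{\mathrm{free}}\,[\mathrm{Ab}]_{\mathrm{free}}}

% Scatchard form for a single class of binding sites:
% B = bound antigen, F = free antigen, B_max = total binding-site concentration.
\frac{B}{F} = K_a\,(B_{\max} - B)
```

Under the single-site assumption, a plot of B/F against B is a straight line with slope -K_a. A heterogeneous polyclonal preparation generally departs from that line, which is consistent with its K_a being described above as an average.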

The titer and avidity of polyclonal antibody preparations may be further evaluated to determine
the quality and suitability of such preparations for certain downstream applications. For example, a
polyclonal antibody preparation containing at least 1-2 mg specific antibody/ml, preferably 5-10 mg
specific antibody/ml, is generally employed in procedures requiring precipitation of PRTS-antibody
complexes. Procedures for evaluating antibody specificity, titer, and avidity, and guidelines for
antibody quality and usage in various applications, are generally available. (See, e.g., Catty, supra, and
Coligan et al. supra.)

In another embodiment of the invention, the polynucleotides encoding PRTS, or any fragment
or complement thereof, may be used for therapeutic purposes. In one aspect, modifications of gene
expression can be achieved by designing complementary sequences or antisense molecules (DNA, RNA,
PNA, or modified oligonucleotides) to the coding or regulatory regions of the gene encoding PRTS.
Such technology is well known in the art, and antisense oligonucleotides or larger fragments can be
designed from various locations along the coding or control regions of sequences encoding PRTS. (See,
e.g., Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press Inc., Totawa NJ.)
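
As a minimal sketch of what designing a complementary sequence to a coding region involves, the following computes the reverse complement of a window of a sense-strand DNA sequence. The function names and the example sequence are hypothetical illustrations; the sequence is not a PRTS sequence, and practical antisense design involves further considerations (target accessibility, chemistry, specificity) that are not modeled here.

```python
# Watson-Crick pairing for DNA, preserving case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense_dna(sense: str) -> str:
    """Return the reverse complement (5'->3') of a sense-strand DNA sequence."""
    unexpected = set(sense) - set("ACGTacgt")
    if unexpected:
        raise ValueError(f"unexpected characters in sequence: {sorted(unexpected)}")
    return sense.translate(_COMPLEMENT)[::-1]


def antisense_oligo(sense: str, start: int, length: int) -> str:
    """Antisense oligonucleotide against sense[start:start + length].

    start is a 0-based offset into the sense strand; both arguments are
    illustrative parameters, not terms from the specification.
    """
    if start < 0 or length <= 0 or start + length > len(sense):
        raise ValueError("window falls outside the sequence")
    return antisense_dna(sense[start:start + length])


if __name__ == "__main__":
    example = "ATGGCCAAA"  # hypothetical coding-sequence fragment
    print(antisense_oligo(example, 0, 6))
```

Sliding the window along the coding or control region yields candidate oligonucleotides from various locations, as the paragraph above describes.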

In therapeutic use, any gene delivery system suitable for introduction of the antisense
sequences into appropriate target cells can be used. Antisense sequences can be delivered
intracellularly in the form of an expression plasmid which, upon transcription, produces a sequence
complementary to at least a portion of the cellular sequence encoding the target protein. (See, e.g.,
Slater, J.E. et al. (1998) J. Allergy Clin. Immunol. 102(3):469-475; and Scanlon, K.J. et al. (1995)
9(13):1288-1296.) Antisense sequences can also be introduced intracellularly through the use of viral
vectors, such as retrovirus and adeno-associated virus vectors. (See, e.g., Miller, A.D. (1990) Blood
76:271; Ausubel, supra; Uckert, W. and W. Walther (1994) Pharmacol. Ther. 63(3):323-347.) Other
gene delivery mechanisms include liposome-derived systems, artificial viral envelopes, and other
systems known in the art. (See, e.g., Rossi, J.J. (1995) Br. Med. Bull. 51(1):217-225; Boado, R.J. et
al. (1998) J. Pharm. Sci. 87(11):1308-1315; and Morris, M.C. et al. (1997) Nucleic Acids Res.
25(14):2730-2736.)

In another embodiment of the invention, polynucleotides encoding PRTS may be used for
somatic or germline gene therapy. Gene therapy may be performed to (i) correct a genetic deficiency
(e.g., in the cases of severe combined immunodeficiency (SCID)-X1 disease characterized by X-linked
inheritance (Cavazzana-Calvo, M. et al. (2000) Science 288:669-672), severe combined
immunodeficiency syndrome associated with an inherited adenosine deaminase (ADA) deficiency
(Blaese, R.M. et al. (1995) Science 270:475-480; Bordignon, C. et al. (1995) Science 270:470-475),
cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207-216; Crystal, R.G. et al. (1995) Hum. Gene
Therapy 6:643-666; Crystal, R.G. et al. (1995) Hum. Gene Therapy 6:667-703), thalassemias, familial
hypercholesterolemia, and hemophilia resulting from Factor VIII or Factor IX deficiencies (Crystal,
R.G. (1995) Science 270:404-410; Verma, I.M. and N. Somia (1997) Nature 389:239-242)), (ii)
express a conditionally lethal gene product (e.g., in the case of cancers which result from unregulated
cell proliferation), or (iii) express a protein which affords protection against intracellular parasites (e.g.,
against human retroviruses, such as human immunodeficiency virus (HIV) (Baltimore, D. (1988)
Nature 335:395-396; Poeschla, E. et al. (1996) Proc. Natl. Acad. Sci. USA 93:11395-11399),
hepatitis B or C virus (HBV, HCV); fungal parasites, such as Candida albicans and Paracoccidioides
brasiliensis; and protozoan parasites such as Plasmodium falciparum and Trypanosoma cruzi). In the
case where a genetic deficiency in PRTS expression or regulation causes disease, the expression of
PRTS from an appropriate population of transduced cells may alleviate the clinical manifestations
caused by the genetic deficiency.

In a further embodiment of the invention, diseases or disorders caused by deficiencies in PRTS
are treated by constructing mammalian expression vectors encoding PRTS and introducing these
vectors by mechanical means into PRTS-deficient cells. Mechanical transfer technologies for use with
cells in vivo or ex vitro include (i) direct DNA microinjection into individual cells, (ii) ballistic gold
particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene transfer, and (v) the
use of DNA transposons (Morgan, R.A. and W.F. Anderson (1993) Annu. Rev. Biochem. 62:191-217;
Ivics, Z. (1997) Cell 91:501-510; Boulay, J-L. and H. Recipon (1998) Curr. Opin. Biotechnol. 9:445-450).

Expression vectors that may be effective for the expression of PRTS include, but are not
limited to, the PCDNA 3.1, EPITAG, PRCCMV2, PREP, PVAX vectors (Invitrogen, Carlsbad CA),
PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla CA), and PTET-OFF,
PTET-ON, PTRE2, PTRE2-LUC, PTK-HYG (Clontech, Palo Alto CA). PRTS may be expressed
using (i) a constitutively active promoter, (e.g., from cytomegalovirus (CMV), Rous sarcoma virus
(RSV), SV40 virus, thymidine kinase (TK), or beta-actin genes), (ii) an inducible promoter (e.g., the
tetracycline-regulated promoter (Gossen, M. and H. Bujard (1992) Proc. Natl. Acad. Sci. USA
89:5547-5551; Gossen, M. et al. (1995) Science 268:1766-1769; Rossi, F.M.V. and H.M. Blau (1998)
Curr. Opin. Biotechnol. 9:451-456), commercially available in the T-REX plasmid (Invitrogen)); the
ecdysone-inducible promoter (available in the plasmids PVGRXR and PIND; Invitrogen); the
FK506/rapamycin inducible promoter; or the RU486/mifepristone inducible promoter (Rossi, F.M.V.
and Blau, H.M. supra)), or (iii) a tissue-specific promoter or the native promoter of the endogenous
gene encoding PRTS from a normal individual.

Commercially available liposome transformation kits (e.g., the PERFECT LIPID
TRANSFECTION KIT, available from Invitrogen) allow one with ordinary skill in the art to deliver
polynucleotides to target cells in culture and require minimal effort to optimize experimental
parameters. In the alternative, transformation is performed using the calcium phosphate method
(Graham, F.L. and A.J. Eb (1973) Virology 52:456-467), or by electroporation (Neumann, E. et al.
(1982) EMBO J. 1:841-845). The introduction of DNA to primary cells requires modification of these
standardized mammalian transfection protocols.

In another embodiment of the invention, diseases or disorders caused by genetic defects with
respect to PRTS expression are treated by constructing a retrovirus vector consisting of (i) the
polynucleotide encoding PRTS under the control of an independent promoter or the retrovirus long
terminal repeat (LTR) promoter, (ii) appropriate RNA packaging signals, and (iii) a Rev-responsive
element (RRE) along with additional retrovirus cis-acting RNA sequences and coding sequences
required for efficient vector propagation. Retrovirus vectors (e.g., PFB and PFBNEO) are
commercially available (Stratagene) and are based on published data (Riviere, I. et al. (1995) Proc.
Natl. Acad. Sci. USA 92:6733-6737), incorporated by reference herein. The vector is propagated in an
appropriate vector producing cell line (VPCL) that expresses an envelope gene with a tropism for
receptors on the target cells or a promiscuous envelope protein such as VSVg (Armentano, D. et al.
(1987) J. Virol. 61:1647-1650; Bender, M.A. et al. (1987) J. Virol. 61:1639-1646; Adam, M.A. and
A.D. Miller (1988) J. Virol. 62:3802-3806; Dull, T. et al. (1998) J. Virol. 72:8463-8471; Zufferey, R.
et al. (1998) J. Virol. 72:9873-9880). U.S. Patent Number 5,910,434 to Rigg ("Method for obtaining
retrovirus packaging cell lines producing high transducing efficiency retroviral supernatant") discloses a
method for obtaining retrovirus packaging cell lines and is hereby incorporated by reference.
Propagation of retrovirus vectors, transduction of a population of cells (e.g., CD4+ T-cells), and the
return of transduced cells to a patient are procedures well known to persons skilled in the art of gene
therapy and have been well documented (Ranga, U. et al. (1997) J. Virol. 71:7020-7029; Bauer, G. et
al. (1997) Blood 89:2259-2267; Bonyhadi, M.L. (1997) J. Virol. 71:4707-4716; Ranga, U. et al.
(1998) Proc. Natl. Acad. Sci. USA 95:1201-1206; Su, L. (1997) Blood 89:2283-2290).

In the alternative, an adenovirus-based gene therapy delivery system is used to deliver
polynucleotides encoding PRTS to cells which have one or more genetic abnormalities with respect to
the expression of PRTS. The construction and packaging of adenovirus-based vectors are well known
to those with ordinary skill in the art. Replication defective adenovirus vectors have proven to be
versatile for importing genes encoding immunoregulatory proteins into intact islets in the pancreas
(Csete, M.E. et al. (1995) Transplantation 27:263-268). Potentially useful adenoviral vectors are
described in U.S. Patent Number 5,707,618 to Armentano ("Adenovirus vectors for gene therapy"),
hereby incorporated by reference. For adenoviral vectors, see also Antinozzi, P.A. et al. (1999) Annu.
Rev. Nutr. 19:511-544 and Verma, I.M. and N. Somia (1997) Nature 18:389:239-242, both
incorporated by reference herein.

In another alternative, a herpes-based, gene therapy delivery system is used to deliver
polynucleotides encoding PRTS to target cells which have one or more genetic abnormalities with
respect to the expression of PRTS. The use of herpes simplex virus (HSV)-based vectors may be
especially valuable for introducing PRTS to cells of the central nervous system, for which HSV has a
tropism. The construction and packaging of herpes-based vectors are well known to those with
ordinary skill in the art. A replication-competent herpes simplex virus (HSV) type 1-based vector has
been used to deliver a reporter gene to the eyes of primates (Liu, X. et al. (1999) Exp. Eye Res.
169:385-395). The construction of a HSV-1 virus vector has also been disclosed in detail in U.S.
Patent Number 5,804,413 to DeLuca ("Herpes simplex virus strains for gene transfer"), which is
hereby incorporated by reference. U.S. Patent Number 5,804,413 teaches the use of recombinant HSV
d92 which consists of a genome containing at least one exogenous gene to be transferred to a cell under
the control of the appropriate promoter for purposes including human gene therapy. Also taught by this
patent are the construction and use of recombinant HSV strains deleted for ICP4, ICP27 and ICP22.
For HSV vectors, see also Goins, W.F. et al. (1999) J. Virol. 73:519-532 and Xu, H. et al. (1994) Dev.
Biol. 163:152-161, hereby incorporated by reference. The manipulation of cloned herpesvirus
sequences, the generation of recombinant virus following the transfection of multiple plasmids
containing different segments of the large herpesvirus genomes, the growth and propagation of
herpesvirus, and the infection of cells with herpesvirus are techniques well known to those of ordinary
skill in the art.

In another alternative, an alphavirus (positive, single-stranded RNA virus) vector is used to
deliver polynucleotides encoding PRTS to target cells. The biology of the prototypic alphavirus,
Semliki Forest Virus (SFV), has been studied extensively and gene transfer vectors have been based on
the SFV genome (Garoff, H. and K.-J. Li (1998) Curr. Opin. Biotechnol. 9:464-469). During
alphavirus RNA replication, a subgenomic RNA is generated that normally encodes the viral capsid
proteins. This subgenomic RNA replicates to higher levels than the full length genomic RNA, resulting
in the overproduction of capsid proteins relative to the viral proteins with enzymatic activity (e.g.,
protease and polymerase). Similarly, inserting the coding sequence for PRTS into the alphavirus
genome in place of the capsid-coding region results in the production of a large number of PRTS-coding
RNAs and the synthesis of high levels of PRTS in vector transduced cells. While alphavirus infection
is typically associated with cell lysis within a few days, the ability to establish a persistent infection in
hamster normal kidney cells (BHK-21) with a variant of Sindbis virus (SIN) indicates that the lytic
replication of alphaviruses can be altered to suit the needs of the gene therapy application (Dryga, S.A.
et al. (1997) Virology 228:74-83). The wide host range of alphaviruses will allow the introduction of
PRTS into a variety of cell types. The specific transduction of a subset of cells in a population may
require the sorting of cells prior to transduction. The methods of manipulating infectious cDNA clones
of alphaviruses, performing alphavirus cDNA and RNA transfections, and performing alphavirus
infections, are well known to those with ordinary skill in the art.

Oligonucleotides derived from the transcription initiation site, e.g., between about positions -10
and +10 from the start site, may also be employed to inhibit gene expression. Similarly, inhibition can
be achieved using triple helix base-pairing methodology. Triple helix pairing is useful because it causes
inhibition of the ability of the double helix to open sufficiently for the binding of polymerases,
transcription factors, or regulatory molecules. Recent therapeutic advances using triplex DNA have
been described in the literature. (See, e.g., Gee, J.E. et al. (1994) in Huber, B.E. and B.I. Carr,
Molecular and Immunologic Approaches, Futura Publishing, Mt. Kisco NY, pp. 163-177.) A
complementary sequence or antisense molecule may also be designed to block translation of mRNA by
preventing the transcript from binding to ribosomes.
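
The selection of an oligonucleotide from the region around the transcription initiation site can be illustrated with a minimal sketch. It assumes the sense-strand DNA sequence and the 0-based index of the +1 nucleotide are already known; the function names and the default flank of 10 nucleotides on each side are hypothetical choices made for illustration, not part of the disclosure.

```python
# Hypothetical helper: derive an antisense oligonucleotide covering roughly
# positions -10 to +10 around a known transcription start site.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def initiation_site_antisense(sense_dna: str, start_index: int, flank: int = 10) -> str:
    """Return the antisense oligo for the window around the start site.

    start_index is the 0-based index of the +1 nucleotide (an assumption of
    this sketch). The window holds `flank` upstream nucleotides and `flank`
    nucleotides beginning at +1.
    """
    begin = start_index - flank
    end = start_index + flank
    if begin < 0 or end > len(sense_dna):
        raise ValueError("sequence too short for the requested window")
    return reverse_complement(sense_dna[begin:end])


if __name__ == "__main__":
    # Illustrative sequence only; not taken from any SEQ ID NO.
    print(initiation_site_antisense("A" * 10 + "C" * 10, start_index=10))
```

The sketch only selects and complements the window; it does not assess specificity or secondary structure, which would need separate evaluation.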

Ribozymes, enzymatic RNA molecules, may also be used to catalyze the specific cleavage of
RNA. The mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme
molecule to complementary target RNA, followed by endonucleolytic cleavage. For example,
engineered hammerhead motif ribozyme molecules may specifically and efficiently catalyze
endonucleolytic cleavage of sequences encoding PRTS.

Specific ribozyme cleavage sites within any potential RNA target are initially identified by
scanning the target molecule for ribozyme cleavage sites, including the following sequences: GUA,
GUU, and GUC. Once identified, short RNA sequences of between 15 and 20 ribonucleotides,
corresponding to the region of the target gene containing the cleavage site, may be evaluated for
secondary structural features which may render the oligonucleotide inoperable. The suitability of
candidate targets may also be evaluated by testing accessibility to hybridization with complementary
oligonucleotides using ribonuclease protection assays.
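
The scanning step described above can be sketched as follows. This is a minimal illustration, assuming the target is supplied as a plain RNA (or DNA-style) string; the function names and the default window length of 17 nucleotides are hypothetical, and the sketch does not perform the secondary-structure or accessibility evaluation mentioned in the text.

```python
import re

# Lookahead pattern so that every occurrence of GUA, GUU, or GUC is reported.
SITE_PATTERN = re.compile(r"(?=(GU[AUC]))")


def find_cleavage_sites(rna: str) -> list[tuple[int, str]]:
    """Return (0-based position, triplet) for each GUA, GUU, or GUC."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start(), m.group(1)) for m in SITE_PATTERN.finditer(rna)]


def candidate_targets(rna: str, length: int = 17) -> list[str]:
    """Return windows of `length` nucleotides roughly centered on each site.

    The 15 to 20 nucleotide bound follows the text; sites too close to
    either end of the molecule to fit a full window are skipped.
    """
    if not 15 <= length <= 20:
        raise ValueError("length must be between 15 and 20")
    rna = rna.upper().replace("T", "U")
    flank = (length - 3) // 2  # nucleotides placed upstream of the triplet
    targets = []
    for pos, _triplet in find_cleavage_sites(rna):
        start = pos - flank
        end = start + length
        if start >= 0 and end <= len(rna):
            targets.append(rna[start:end])
    return targets


if __name__ == "__main__":
    # Illustrative sequence only; not taken from any SEQ ID NO.
    example = "AAAAAAAAGUCAAAAAAAAA"
    print(find_cleavage_sites(example))
    print(candidate_targets(example))
```

Each returned window would still need to be checked for secondary structure and hybridization accessibility before being treated as a usable target.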

Complementary ribonucleic acid molecules and ribozymes of the invention may be prepared by
any method known in the art for the synthesis of nucleic acid molecules. These include techniques for
chemically synthesizing oligonucleotides such as solid phase phosphoramidite chemical synthesis.
Alternatively, RNA molecules may be generated by in vitro and in vivo transcription of DNA sequences
encoding PRTS. Such DNA sequences may be incorporated into a wide variety of vectors with suitable
RNA polymerase promoters such as T7 or SP6. Alternatively, these cDNA constructs that synthesize
complementary RNA, constitutively or inducibly, can be introduced into cell lines, cells, or tissues.

RNA molecules may be modified to increase intracellular stability and half-life. Possible
modifications include, but are not limited to, the addition of flanking sequences at the 5' and/or 3' ends
of the molecule, or the use of phosphorothioate or 2' O-methyl rather than phosphodiester linkages
within the backbone of the molecule. This concept is inherent in the production of PNAs and can be
extended in all of these molecules by the inclusion of nontraditional bases such as inosine, queosine, and
wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms of adenine, cytidine,
guanine, thymine, and uridine which are not as easily recognized by endogenous endonucleases.

An additional embodiment of the invention encompasses a method for screening for a
compound which is effective in altering expression of a polynucleotide encoding PRTS. Compounds
which may be effective in altering expression of a specific polynucleotide may include, but are not
limited to, oligonucleotides, antisense oligonucleotides, triple helix-forming oligonucleotides,
transcription factors and other polypeptide transcriptional regulators, and non-macromolecular
chemical entities which are capable of interacting with specific polynucleotide sequences. Effective
compounds may alter polynucleotide expression by acting as either inhibitors or promoters of
polynucleotide expression. Thus, in the treatment of disorders associated with increased PRTS
expression or activity, a compound which specifically inhibits expression of the polynucleotide
encoding PRTS may be therapeutically useful, and in the treatment of disorders associated with
decreased PRTS expression or activity, a compound which specifically promotes expression of the
polynucleotide encoding PRTS may be therapeutically useful.

At least one, and up to a plurality, of test compounds may be screened for effectiveness in
altering expression of a specific polynucleotide. A test compound may be obtained by any method
commonly known in the art, including chemical modification of a compound known to be effective in
altering polynucleotide expression; selection from an existing, commercially-available or proprietary
library of naturally-occurring or non-natural chemical compounds; rational design of a compound
based on chemical and/or structural properties of the target polynucleotide; and selection from a
library of chemical compounds created combinatorially or randomly. A sample comprising a
polynucleotide encoding PRTS is exposed to at least one test compound thus obtained. The sample
may comprise, for example, an intact or permeabilized cell, or an in vitro cell-free or reconstituted
biochemical system. Alterations in the expression of a polynucleotide encoding PRTS are assayed by
any method commonly known in the art. Typically, the expression of a specific nucleotide is detected
by hybridization with a probe having a nucleotide sequence complementary to the sequence of the
polynucleotide encoding PRTS. The amount of hybridization may be quantified, thus forming the
basis for a comparison of the expression of the polynucleotide both with and without exposure to one
or more test compounds. Detection of a change in the expression of a polynucleotide exposed to a
test compound indicates that the test compound is effective in altering the expression of the
polynucleotide. A screen for a compound effective in altering expression of a specific polynucleotide
can be carried out, for example, using a Schizosaccharomyces pombe gene expression system (Atkins,
D. et al. (1999) U.S. Patent No. 5,932,435; Arndt, G.M. et al. (2000) Nucleic Acids Res. 28:E15) or a
human cell line such as HeLa cell (Clarke, M.L. et al. (2000) Biochem. Biophys. Res. Commun.
268:8-13). A particular embodiment of the present invention involves screening a combinatorial
library of oligonucleotides (such as deoxyribonucleotides, ribonucleotides, peptide nucleic acids, and
modified oligonucleotides) for antisense activity against a specific polynucleotide sequence (Bruice,
T.W. et al. (1997) U.S. Patent No. 5,686,242; Bruice, T.W. et al. (2000) U.S. Patent No. 6,022,691).

Many methods for introducing vectors into cells or tissues are available and equally suitable for
use in vivo, in vitro, and ex vivo. For ex vivo therapy, vectors may be introduced into stem cells taken
from the patient and clonally propagated for autologous transplant back into that same patient.
Delivery by transfection, by liposome injections, or by polycationic amino polymers may be achieved
using methods which are well known in the art. (See, e.g., Goldman, C.K. et al. (1997) Nat.
Biotechnol. 15:462-466.)

Any of the therapeutic methods described above may be applied to any subject in need of such
therapy, including, for example, mammals such as humans, dogs, cats, cows, horses, rabbits, and
monkeys.

An additional embodiment of the invention relates to the administration of a composition which
generally comprises an active ingredient formulated with a pharmaceutically acceptable excipient.
Excipients may include, for example, sugars, starches, celluloses, gums, and proteins. Various
formulations are commonly known and are thoroughly discussed in the latest edition of Remington's
Pharmaceutical Sciences (Maack Publishing, Easton PA). Such compositions may consist of PRTS,
antibodies to PRTS, and mimetics, agonists, antagonists, or inhibitors of PRTS.

The compositions utilized in this invention may be administered by any number of routes
including, but not limited to, oral, intravenous, intramuscular, intra-arterial, intramedullary, intrathecal,
intraventricular, pulmonary, transdermal, subcutaneous, intraperitoneal, intranasal, enteral, topical,
sublingual, or rectal means.

Compositions for pulmonary administration may be prepared in liquid or dry powder form.
These compositions are generally aerosolized immediately prior to inhalation by the patient. In the case
of small molecules (e.g. traditional low molecular weight organic drugs), aerosol delivery of fast-acting
formulations is well-known in the art. In the case of macromolecules (e.g. larger peptides and proteins),
recent developments in the field of pulmonary delivery via the alveolar region of the lung have enabled
the practical delivery of drugs such as insulin to blood circulation (see, e.g., Patton, J.S. et al., U.S.
Patent No. 5,997,848). Pulmonary delivery has the advantage of administration without needle
injection, and obviates the need for potentially toxic penetration enhancers.

Compositions suitable for use in the invention include compositions wherein the active
ingredients are contained in an effective amount to achieve the intended purpose. The determination of
an effective dose is well within the capability of those skilled in the art.

Specialized forms of compositions may be prepared for direct intracellular delivery of
macromolecules comprising PRTS or fragments thereof. For example, liposome preparations
containing a cell-impermeable macromolecule may promote cell fusion and intracellular delivery of the
macromolecule. Alternatively, PRTS or a fragment thereof may be joined to a short cationic N-
terminal portion from the HIV Tat-1 protein. Fusion proteins thus generated have been found to
transduce into the cells of all tissues, including the brain, in a mouse model system (Schwarze, S.R. et
al. (1999) Science 285:1569-1572).

For any compound, the therapeutically effective dose can be estimated initially either in cell
culture assays, e.g., of neoplastic cells, or in animal models such as mice, rats, rabbits, dogs, monkeys,
or pigs. An animal model may also be used to determine the appropriate concentration range and route
of administration. Such information can then be used to determine useful doses and routes for
administration in humans.

A therapeutically effective dose refers to that amount of active ingredient, for example PRTS or
fragments thereof, antibodies of PRTS, and agonists, antagonists or inhibitors of PRTS, which
ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be determined by
standard pharmaceutical procedures in cell cultures or with experimental animals, such as by
calculating the ED50 (the dose therapeutically effective in 50% of the population) or LD50 (the dose
lethal to 50% of the population) statistics. The dose ratio of toxic to therapeutic effects is the
therapeutic index, which can be expressed as the LD50/ED50 ratio. Compositions which exhibit large
therapeutic indices are preferred. The data obtained from cell culture assays and animal studies are
used to formulate a range of dosage for human use. The dosage contained in such compositions is
preferably within a range of circulating concentrations that includes the ED50 with little or no toxicity.
The dosage varies within this range depending upon the dosage form employed, the sensitivity of the
patient, and the route of administration.
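
The LD50/ED50 ratio described above can be illustrated with a short sketch. It assumes dose-response data are available as paired doses and responding fractions, and it estimates the 50% point by linear interpolation on a log10 dose scale; that interpolation method, the function names, and the example figures are all hypothetical choices for illustration rather than a procedure specified in the text.

```python
import math


def dose_at_half_response(doses, fractions):
    """Estimate the dose giving a 50% response.

    Interpolates linearly on a log10 dose scale between the two measured
    points that bracket a response fraction of 0.5.
    """
    points = sorted(zip(doses, fractions))
    for (d_lo, f_lo), (d_hi, f_hi) in zip(points, points[1:]):
        if f_lo <= 0.5 <= f_hi and f_hi > f_lo:
            t = (0.5 - f_lo) / (f_hi - f_lo)
            log_dose = math.log10(d_lo) + t * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_dose
    raise ValueError("response data do not bracket 50%")


def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the LD50/ED50 ratio."""
    if ed50 <= 0:
        raise ValueError("ED50 must be positive")
    return ld50 / ed50


if __name__ == "__main__":
    # Made-up numbers in arbitrary dose units, for illustration only.
    ed50 = dose_at_half_response([1, 100], [0.25, 0.75])
    ld50 = dose_at_half_response([100, 10000], [0.25, 0.75])
    print(ed50, ld50, therapeutic_index(ld50, ed50))
```

A larger ratio corresponds to a wider margin between the effective and lethal doses, which is why compositions with large therapeutic indices are preferred.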

The exact dosage will be determined by the practitioner, in light of factors related to the subject
requiring treatment. Dosage and administration are adjusted to provide sufficient levels of the active
moiety or to maintain the desired effect. Factors which may be taken into account include the severity
of the disease state, the general health of the subject, the age, weight, and gender of the subject, time
and frequency of administration, drug combination(s), reaction sensitivities, and response to therapy.
Long-acting compositions may be administered every 3 to 4 days, every week, or biweekly depending
on the half-life and clearance rate of the particular formulation.

Normal dosage amounts may vary from about 0.1 μg to 100,000 μg, up to a total dose of
about 1 gram, depending upon the route of administration. Guidance as to particular dosages and
methods of delivery is provided in the literature and generally available to practitioners in the art.
Those skilled in the art will employ different formulations for nucleotides than for proteins or their
inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells,
conditions, locations, etc.
DIAGNOSTICS 

In another embodiment, antibodies which specifically bind PRTS may be used for the diagnosis
of disorders characterized by expression of PRTS, or in assays to monitor patients being treated with
PRTS or agonists, antagonists, or inhibitors of PRTS. Antibodies useful for diagnostic purposes may
be prepared in the same manner as described above for therapeutics. Diagnostic assays for PRTS
include methods which utilize the antibody and a label to detect PRTS in human body fluids or in
extracts of cells or tissues. The antibodies may be used with or without modification, and may be
labeled by covalent or non-covalent attachment of a reporter molecule. A wide variety of reporter
molecules, several of which are described above, are known in the art and may be used.

A variety of protocols for measuring PRTS, including ELISAs, RIAs, and FACS, are known in
the art and provide a basis for diagnosing altered or abnormal levels of PRTS expression. Normal or
standard values for PRTS expression are established by combining body fluids or cell extracts taken
from normal mammalian subjects, for example, human subjects, with antibodies to PRTS under
conditions suitable for complex formation. The amount of standard complex formation may be
quantitated by various methods, such as photometric means. Quantities of PRTS expressed in subject,
control, and disease samples from biopsied tissues are compared with the standard values. Deviation
between standard and subject values establishes the parameters for diagnosing disease.

In another embodiment of the invention, the polynucleotides encoding PRTS may be used for
diagnostic purposes. The polynucleotides which may be used include oligonucleotide sequences,
complementary RNA and DNA molecules, and PNAs. The polynucleotides may be used to detect and
quantify gene expression in biopsied tissues in which expression of PRTS may be correlated with
disease. The diagnostic assay may be used to determine absence, presence, and excess expression of
PRTS, and to monitor regulation of PRTS levels during therapeutic intervention.

In one aspect, hybridization with PCR probes which are capable of detecting polynucleotide
sequences, including genomic sequences, encoding PRTS or closely related molecules may be used to
identify nucleic acid sequences which encode PRTS. The specificity of the probe, whether it is made
from a highly specific region, e.g., the 5' regulatory region, or from a less specific region, e.g., a
conserved motif, and the stringency of the hybridization or amplification will determine whether the
probe identifies only naturally occurring sequences encoding PRTS, allelic variants, or related
sequences.

Probes may also be used for the detection of related sequences, and may have at least 50%
sequence identity to any of the PRTS encoding sequences. The hybridization probes of the subject
invention may be DNA or RNA and may be derived from the sequence of SEQ ID NO:22-42 or from
genomic sequences including promoters, enhancers, and introns of the PRTS gene.
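
The 50% sequence identity criterion can be illustrated with a minimal sketch. It assumes the probe and the PRTS-encoding sequence have already been aligned to equal length, with gaps written as "-", and it counts identity only over ungapped columns; identity can be defined in other ways, so both the convention and the function names here are assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over ungapped columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # columns with a gap are skipped in this sketch
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 50.0) -> bool:
    """True when identity is at least `threshold` percent (50% per the text)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Illustrative sequences only; not taken from any SEQ ID NO.
    print(percent_identity("ACGTACGT", "ACGTTTTT"))
    print(meets_identity_threshold("AAAA", "TTTA"))
```

In practice the alignment itself would come from a dedicated alignment tool; the sketch covers only the final identity calculation.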

Means for producing specific hybridization probes for DNAs encoding PRTS include the
cloning of polynucleotide sequences encoding PRTS or PRTS derivatives into vectors for the
production of mRNA probes. Such vectors are known in the art, are commercially available, and may
be used to synthesize RNA probes in vitro by means of the addition of the appropriate RNA
polymerases and the appropriate labeled nucleotides. Hybridization probes may be labeled by a variety
of reporter groups, for example, by radionuclides such as 32P or 35S, or by enzymatic labels, such as
alkaline phosphatase coupled to the probe via avidin/biotin coupling systems, and the like.

Polynucleotide sequences encoding PRTS may be used for the diagnosis of disorders associated
with expression of PRTS. Examples of such disorders include, but are not limited to, a gastrointestinal
disorder, such as dysphagia, peptic esophagitis, esophageal spasm, esophageal stricture, esophageal
carcinoma, dyspepsia, indigestion, gastritis, gastric carcinoma, anorexia, nausea, emesis, gastroparesis,
antral or pyloric edema, abdominal angina, pyrosis, gastroenteritis, intestinal obstruction, infections of
the intestinal tract, peptic ulcer, cholelithiasis, cholecystitis, cholestasis, pancreatitis, pancreatic
carcinoma, biliary tract disease, hepatitis, hyperbilirubinemia, cirrhosis, passive congestion of the liver,
hepatoma, infectious colitis, ulcerative colitis, ulcerative proctitis, Crohn's disease, Whipple's disease,
Mallory-Weiss syndrome, colonic carcinoma, colonic obstruction, irritable bowel syndrome, short
bowel syndrome, diarrhea, constipation, gastrointestinal hemorrhage, acquired immunodeficiency
syndrome (AIDS) enteropathy, jaundice, hepatic encephalopathy, hepatorenal syndrome, hepatic
steatosis, hemochromatosis, Wilson's disease, alpha1-antitrypsin deficiency, Reye's syndrome, primary
sclerosing cholangitis, liver infarction, portal vein obstruction and thrombosis, centrilobular necrosis,
peliosis hepatis, hepatic vein thrombosis, veno-occlusive disease, preeclampsia, eclampsia, acute fatty
liver of pregnancy, intrahepatic cholestasis of pregnancy, and hepatic tumors including nodular
hyperplasias, adenomas, and carcinomas; a cardiovascular disorder, such as arteriovenous fistula,
atherosclerosis, hypertension, vasculitis, Raynaud's disease, aneurysms, arterial dissections, varicose
veins, thrombophlebitis and phlebothrombosis, vascular tumors, and complications of thrombolysis,
balloon angioplasty, vascular replacement, and coronary artery bypass graft surgery, congestive heart
failure, ischemic heart disease, angina pectoris, myocardial infarction, hypertensive heart disease,
degenerative valvular heart disease, calcific aortic valve stenosis, congenitally bicuspid aortic valve,
mitral annular calcification, mitral valve prolapse, rheumatic fever and rheumatic heart disease,
infective endocarditis, nonbacterial thrombotic endocarditis, endocarditis of systemic lupus
erythematosus, carcinoid heart disease, cardiomyopathy, myocarditis, pericarditis, neoplastic heart
disease, congenital heart disease, and complications of cardiac transplantation; an
autoimmune/inflammatory disorder, such as acquired immunodeficiency syndrome (AIDS), Addison's
disease, adult respiratory distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia,
asthma, atherosclerosis, atherosclerotic plaque rupture, autoimmune hemolytic anemia, autoimmune
thyroiditis, autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis,
cholecystitis, contact dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus,
emphysema, episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum,
atrophic gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's
thyroiditis, hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis,
myocardial or pericardial inflammation, osteoarthritis, degradation of articular cartilage, osteoporosis,
pancreatitis, polymyositis, psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjogren's
syndrome, systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis, thrombocytopenic
purpura, ulcerative colitis, uveitis, Werner syndrome, complications of cancer, hemodialysis, and
extracorporeal circulation, viral, bacterial, fungal, parasitic, protozoal, and helminthic infections, and
trauma; a cell proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis,
cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal
hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and cancers including
adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in
particular, cancers of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder,
ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis,
prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus; a developmental disorder,
such as renal tubular acidosis, anemia, Cushing's syndrome, achondroplastic dwarfism, Duchenne and
Becker muscular dystrophy, bone resorption, epilepsy, gonadal dysgenesis, WAGR syndrome (Wilms'
tumor, aniridia, genitourinary abnormalities, and mental retardation), Smith-Magenis syndrome,
myelodysplastic syndrome, hereditary mucoepithelial dysplasia, hereditary keratodermas, hereditary
neuropathies such as Charcot-Marie-Tooth disease and neurofibromatosis, hypothyroidism,
hydrocephalus, seizure disorders such as Sydenham's chorea and cerebral palsy, spina bifida,
anencephaly, craniorachischisis, congenital glaucoma, cataract, age-related macular degeneration, and
sensorineural hearing loss; an epithelial disorder, such as dyshidrotic eczema, allergic contact
dermatitis, keratosis pilaris, melasma, vitiligo, actinic keratosis, basal cell carcinoma, squamous cell
carcinoma, seborrheic keratosis, folliculitis, herpes simplex, herpes zoster, varicella, candidiasis,
dermatophytosis, scabies, insect bites, cherry angioma, keloid, dermatofibroma, acrochordons,
urticaria, transient acantholytic dermatosis, xerosis, eczema, atopic dermatitis, contact dermatitis, hand
eczema, nummular eczema, lichen simplex chronicus, asteatotic eczema, stasis dermatitis and stasis
ulceration, seborrheic dermatitis, psoriasis, lichen planus, pityriasis rosea, impetigo, ecthyma,
dermatophytosis, tinea versicolor, warts, acne vulgaris, acne rosacea, pemphigus vulgaris, pemphigus
foliaceus, paraneoplastic pemphigus, bullous pemphigoid, herpes gestationis, dermatitis herpetiformis,
linear IgA disease, epidermolysis bullosa acquisita, dermatomyositis, lupus erythematosus, scleroderma
and morphea, erythroderma, alopecia, figurate skin lesions, telangiectasias, hypopigmentation,
hyperpigmentation, vesicles/bullae, exanthems, cutaneous drug reactions, papulonodular skin lesions,
chronic non-healing wounds, photosensitivity diseases, epidermolysis bullosa simplex, epidermolytic
hyperkeratosis, epidermolytic and nonepidermolytic palmoplantar keratoderma, ichthyosis bullosa of
Siemens, ichthyosis exfoliativa, keratosis palmaris et plantaris, keratosis palmoplantaris, palmoplantar
keratoderma, keratosis punctata, Meesmann's corneal dystrophy, pachyonychia congenita, white sponge
nevus, steatocystoma multiplex, epidermal nevi/epidermolytic hyperkeratosis type, monilethrix,
trichothiodystrophy, chronic hepatitis/cryptogenic cirrhosis, and colorectal hyperplasia; a neurological
disorder, such as epilepsy, ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's
disease, Pick's disease, Huntington's disease, dementia, Parkinson's disease and other extrapyramidal
disorders, amyotrophic lateral sclerosis and other motor neuron disorders, progressive neural muscular
atrophy, retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases,
bacterial and viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative
intracranial thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion
diseases including kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome,
fatal familial insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis,
tuberous sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental
retardation and other developmental disorders of the central nervous system including Down syndrome,
cerebral palsy, neuroskeletal disorders, autonomic nervous system disorders, cranial nerve disorders,
spinal cord diseases, muscular dystrophy and other neuromuscular disorders, peripheral nervous system
disorders, dermatomyositis and polymyositis, inherited, metabolic, endocrine, and toxic myopathies,
myasthenia gravis, periodic paralysis, mental disorders including mood, anxiety, and schizophrenic
disorders, seasonal affective disorder (SAD), akathesia, amnesia, catatonia, diabetic neuropathy, tardive
dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia, Tourette's disorder, progressive
supranuclear palsy, corticobasal degeneration, and familial frontotemporal dementia; and a reproductive
disorder, such as infertility, including tubal disease, ovulatory defects, and endometriosis, a disorder of
prolactin production, a disruption of the estrous cycle, a disruption of the menstrual cycle, polycystic
ovary syndrome, ovarian hyperstimulation syndrome, an endometrial or ovarian tumor, a uterine
fibroid, autoimmune disorders, an ectopic pregnancy, and teratogenesis; cancer of the breast, fibrocystic
breast disease, and galactorrhea; a disruption of spermatogenesis, abnormal sperm physiology, cancer
of the testis, cancer of the prostate, benign prostatic hyperplasia, prostatitis, Peyronie's disease,
impotence, carcinoma of the male breast, and gynecomastia. The polynucleotide sequences encoding
PRTS may be used in Southern or northern analysis, dot blot, or other membrane-based technologies; in
PCR technologies; in dipstick, pin, and multiformat ELISA-like assays; and in microarrays utilizing
fluids or tissues from patients to detect altered PRTS expression. Such qualitative or quantitative
methods are well known in the art.

In a particular aspect, the nucleotide sequences encoding PRTS may be useful in assays that
detect the presence of associated disorders, particularly those mentioned above. The nucleotide
sequences encoding PRTS may be labeled by standard methods and added to a fluid or tissue sample
from a patient under conditions suitable for the formation of hybridization complexes. After a suitable
incubation period, the sample is washed and the signal is quantified and compared with a standard
value. If the amount of signal in the patient sample is significantly altered in comparison to a control
sample then the presence of altered levels of nucleotide sequences encoding PRTS in the sample
indicates the presence of the associated disorder. Such assays may also be used to evaluate the efficacy
of a particular therapeutic treatment regimen in animal studies, in clinical trials, or to monitor the
treatment of an individual patient.

In order to provide a basis for the diagnosis of a disorder associated with expression of PRTS,
a normal or standard profile for expression is established. This may be accomplished by combining
body fluids or cell extracts taken from normal subjects, either animal or human, with a sequence, or a
fragment thereof, encoding PRTS, under conditions suitable for hybridization or amplification.
Standard hybridization may be quantified by comparing the values obtained from normal subjects with
values from an experiment in which a known amount of a substantially purified polynucleotide is used.
Standard values obtained in this manner may be compared with values obtained from samples from
patients who are symptomatic for a disorder. Deviation from standard values is used to establish the
presence of a disorder.
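
The comparison of a patient value against standard values can be sketched as follows. This is only an illustrative sketch: the specification does not prescribe a statistic, so the z-score, the cutoff of 2.0, and the function name `deviates_from_standard` are assumptions introduced here.

```python
from statistics import mean, stdev


def deviates_from_standard(normal_values, patient_value, z_cutoff=2.0):
    """Compare one patient signal with signals from normal subjects.

    Returns (z_score, flagged). The z-score and the cutoff are
    illustrative assumptions, not values taken from the specification.
    """
    mu = mean(normal_values)
    sigma = stdev(normal_values)  # sample standard deviation
    if sigma == 0:
        raise ValueError("normal values show no variation")
    z = (patient_value - mu) / sigma
    return z, abs(z) >= z_cutoff


# Hypothetical hybridization signals (arbitrary units).
normal = [10, 11, 9, 10, 10]
print(deviates_from_standard(normal, 15))
print(deviates_from_standard(normal, 10.5))
```

In practice the standard profile would be built from many more normal subjects, and the choice of cutoff would depend on the assay.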

Once the presence of a disorder is established and a treatment protocol is initiated,
hybridization assays may be repeated on a regular basis to determine if the level of expression in the
patient begins to approximate that which is observed in the normal subject. The results obtained from
successive assays may be used to show the efficacy of treatment over a period ranging from several
days to months.

With respect to cancer, the presence of an abnormal amount of transcript (either under- or
overexpressed) in biopsied tissue from an individual may indicate a predisposition for the development
of the disease, or may provide a means for detecting the disease prior to the appearance of actual
clinical symptoms. A more definitive diagnosis of this type may allow health professionals to employ
preventative measures or aggressive treatment earlier, thereby preventing the development or further
progression of the cancer.

Additional diagnostic uses for oligonucleotides designed from the sequences encoding PRTS
may involve the use of PCR. These oligomers may be chemically synthesized, generated enzymatically,
or produced in vitro. Oligomers will preferably contain a fragment of a polynucleotide encoding PRTS,
or a fragment of a polynucleotide complementary to the polynucleotide encoding PRTS, and will be
employed under optimized conditions for identification of a specific gene or condition. Oligomers may
also be employed under less stringent conditions for detection or quantification of closely related DNA
or RNA sequences.

In a particular aspect, oligonucleotide primers derived from the polynucleotide sequences
encoding PRTS may be used to detect single nucleotide polymorphisms (SNPs). SNPs are
substitutions, insertions and deletions that are a frequent cause of inherited or acquired genetic disease
in humans. Methods of SNP detection include, but are not limited to, single-stranded conformation
polymorphism (SSCP) and fluorescent SSCP (fSSCP) methods. In SSCP, oligonucleotide primers
derived from the polynucleotide sequences encoding PRTS are used to amplify DNA using the
polymerase chain reaction (PCR). The DNA may be derived, for example, from diseased or normal
tissue, biopsy samples, bodily fluids, and the like. SNPs in the DNA cause differences in the secondary
and tertiary structures of PCR products in single-stranded form, and these differences are detectable
using gel electrophoresis in non-denaturing gels. In fSSCP, the oligonucleotide primers are
fluorescently labeled, which allows detection of the amplimers in high-throughput equipment such as
DNA sequencing machines. Additionally, sequence database analysis methods, termed in silico SNP
(isSNP), are capable of identifying polymorphisms by comparing the sequence of individual
overlapping DNA fragments which assemble into a common consensus sequence. These computer-based
methods filter out sequence variations due to laboratory preparation of DNA and sequencing
errors using statistical models and automated analyses of DNA sequence chromatograms. In the
alternative, SNPs may be detected and characterized by mass spectrometry using, for example, the high
throughput MASSARRAY system (Sequenom, Inc., San Diego CA).
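
The in silico comparison of overlapping fragments against a consensus can be illustrated with a deliberately simplified sketch. It assumes the fragments are already aligned to equal length (with `-` marking positions a fragment does not cover) and replaces the statistical error models mentioned above with a plain minimum-count threshold. The function name `candidate_snps` and the parameter `min_minor_count` are hypothetical.

```python
from collections import Counter


def candidate_snps(aligned_reads, min_minor_count=2):
    """Report positions where a second base recurs among aligned reads.

    aligned_reads: equal-length strings over A/C/G/T, '-' for no coverage.
    A variant seen in fewer than min_minor_count reads is treated as a
    likely sequencing error (a crude stand-in for a statistical model).
    Returns a list of (position, consensus_base, minor_base, minor_count).
    """
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    hits = []
    for pos in range(length):
        counts = Counter(r[pos] for r in aligned_reads if r[pos] in "ACGT")
        if len(counts) < 2:
            continue  # no disagreement at this position
        ranked = counts.most_common()
        consensus_base = ranked[0][0]
        minor_base, minor_count = ranked[1]
        if minor_count >= min_minor_count:
            hits.append((pos, consensus_base, minor_base, minor_count))
    return hits


reads = ["ACGTAC", "ACGTAC", "ACTTAC", "ACTTAC", "ACGTAC", "ACGAAC"]
print(candidate_snps(reads))
```

Here the single `A` at position 3 is discarded as a probable error, while the recurring `T` at position 2 is reported as a candidate polymorphism.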

Methods which may also be used to quantify the expression of PRTS include radiolabeling or
biotinylating nucleotides, coamplification of a control nucleic acid, and interpolating results from
standard curves. (See, e.g., Melby, P.C. et al. (1993) J. Immunol. Methods 159:235-244; Duplaa, C. et
al. (1993) Anal. Biochem. 212:229-236.) The speed of quantitation of multiple samples may be
accelerated by running the assay in a high-throughput format where the oligomer or polynucleotide of
interest is presented in various dilutions and a spectrophotometric or colorimetric response gives rapid
quantitation.
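
Interpolating a result from a standard curve can be sketched as below. This is a minimal illustration assuming piecewise-linear interpolation between neighboring standards; the function name `interpolate_from_standard_curve` and the example values are hypothetical, and a real assay might instead fit a curve to the standards.

```python
def interpolate_from_standard_curve(standards, response):
    """Estimate an amount from a measured response.

    standards: list of (known_amount, measured_response) pairs.
    Uses linear interpolation between the two standards whose responses
    bracket the measurement; raises ValueError outside the curve.
    """
    points = sorted(standards, key=lambda p: p[1])
    for (a0, r0), (a1, r1) in zip(points, points[1:]):
        if r0 <= response <= r1:
            if r1 == r0:
                return (a0 + a1) / 2
            fraction = (response - r0) / (r1 - r0)
            return a0 + fraction * (a1 - a0)
    raise ValueError("response outside the range of the standard curve")


# Hypothetical dilution series: (amount, absorbance).
curve = [(0, 0.0), (10, 0.5), (20, 1.0)]
print(interpolate_from_standard_curve(curve, 0.75))
```

Refusing to extrapolate beyond the measured standards is a design choice made for this sketch; it keeps the estimate within the range the curve actually supports.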

In further embodiments, oligonucleotides or longer fragments derived from any of the
polynucleotide sequences described herein may be used as elements on a microarray. The microarray
can be used in transcript imaging techniques which monitor the relative expression levels of large
numbers of genes simultaneously as described below. The microarray may also be used to identify
genetic variants, mutations, and polymorphisms. This information may be used to determine gene
function, to understand the genetic basis of a disorder, to diagnose a disorder, to monitor
progression/regression of disease as a function of gene expression, and to develop and monitor the
activities of therapeutic agents in the treatment of disease. In particular, this information may be used
to develop a pharmacogenomic profile of a patient in order to select the most appropriate and effective
treatment regimen for that patient. For example, therapeutic agents which are highly effective and
display the fewest side effects may be selected for a patient based on his/her pharmacogenomic profile.

In another embodiment, PRTS, fragments of PRTS, or antibodies specific for PRTS may be
used as elements on a microarray. The microarray may be used to monitor or measure protein-protein
interactions, drug-target interactions, and gene expression profiles, as described above.

A particular embodiment relates to the use of the polynucleotides of the present invention to
generate a transcript image of a tissue or cell type. A transcript image represents the global pattern of
gene expression by a particular tissue or cell type. Global gene expression patterns are analyzed by
quantifying the number of expressed genes and their relative abundance under given conditions and at a
given time. (See Seilhamer et al., "Comparative Gene Transcript Analysis," U.S. Patent Number
5,840,484, expressly incorporated by reference herein.) Thus a transcript image may be generated by
hybridizing the polynucleotides of the present invention or their complements to the totality of
transcripts or reverse transcripts of a particular tissue or cell type. In one embodiment, the
hybridization takes place in high-throughput format, wherein the polynucleotides of the present
invention or their complements comprise a subset of a plurality of elements on a microarray. The
resultant transcript image would provide a profile of gene activity.

Transcript images may be generated using transcripts isolated from tissues, cell lines, biopsies,
or other biological samples. The transcript image may thus reflect gene expression in vivo, as in the
case of a tissue or biopsy sample, or in vitro, as in the case of a cell line.

Transcript images which profile the expression of the polynucleotides of the present invention
may also be used in conjunction with in vitro model systems and preclinical evaluation of
pharmaceuticals, as well as toxicological testing of industrial and naturally-occurring environmental
compounds. All compounds induce characteristic gene expression patterns, frequently termed
molecular fingerprints or toxicant signatures, which are indicative of mechanisms of action and toxicity
(Nuwaysir, E.F. et al. (1999) Mol. Carcinog. 24:153-159; Steiner, S. and N.L. Anderson (2000)
Toxicol. Lett. 112-113:467-471, expressly incorporated by reference herein). If a test compound has a
signature similar to that of a compound with known toxicity, it is likely to share those toxic properties.
These fingerprints or signatures are most useful and refined when they contain expression information
from a large number of genes and gene families. Ideally, a genome-wide measurement of expression
provides the highest quality signature. Even genes whose expression is not altered by any tested
compounds are important as well, as the levels of expression of these genes are used to normalize the
rest of the expression data. The normalization procedure is useful for comparison of expression data
after treatment with different compounds. While the assignment of gene function to elements of a
toxicant signature aids in interpretation of toxicity mechanisms, knowledge of gene function is not
necessary for the statistical matching of signatures which leads to prediction of toxicity. (See, for
example, Press Release 00-02 from the National Institute of Environmental Health Sciences, released
February 29, 2000, available at http://www.niehs.nih.gov/oc/news/toxchip.htm.) Therefore, it is
important and desirable in toxicological screening using toxicant signatures to include all expressed
gene sequences.
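
The normalization and statistical matching of signatures described above might be sketched as follows. The sketch makes several assumptions not found in the specification: expression profiles are dictionaries mapping gene names to levels, normalization divides by the mean of designated unaltered reference genes, and similarity is measured with a Pearson correlation. All function and gene names are hypothetical.

```python
from math import sqrt


def normalize(profile, reference_genes):
    """Scale a profile by the mean level of genes assumed to be unaltered."""
    ref_mean = sum(profile[g] for g in reference_genes) / len(reference_genes)
    return {gene: level / ref_mean for gene, level in profile.items()}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("a profile with no variation cannot be correlated")
    return cov / sqrt(vx * vy)


def signature_similarity(test_profile, known_profile, reference_genes):
    """Correlate two normalized signatures over the genes they share."""
    t = normalize(test_profile, reference_genes)
    k = normalize(known_profile, reference_genes)
    genes = sorted(set(t) & set(k))
    return pearson([t[g] for g in genes], [k[g] for g in genes])


known_toxicant = {"g1": 10, "g2": 40, "g3": 5, "ref": 20}
test_compound = {"g1": 5, "g2": 20, "g3": 2.5, "ref": 10}
print(signature_similarity(test_compound, known_toxicant, ["ref"]))
```

Note that the matching step never consults gene function, which mirrors the point made in the text: only the expression values enter the comparison.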

In one embodiment, the toxicity of a test compound is assessed by treating a biological sample
containing nucleic acids with the test compound. Nucleic acids that are expressed in the treated
biological sample are hybridized with one or more probes specific to the polynucleotides of the
present invention, so that transcript levels corresponding to the polynucleotides of the present
invention may be quantified. The transcript levels in the treated biological sample are compared with
levels in an untreated biological sample. Differences in the transcript levels between the two samples
are indicative of a toxic response caused by the test compound in the treated sample.
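
The comparison of treated and untreated transcript levels can be sketched as a log2 fold-change calculation. The pseudocount, the twofold threshold, and the function name `altered_transcripts` are illustrative assumptions; the specification does not state how a difference is to be scored.

```python
from math import log2


def altered_transcripts(treated, untreated, min_abs_log2=1.0, pseudocount=1.0):
    """Return {gene: log2 fold change} for genes that change enough.

    treated / untreated: dicts mapping gene name to transcript level.
    The pseudocount avoids division by zero for undetected transcripts.
    """
    changes = {}
    for gene in treated.keys() & untreated.keys():
        ratio = (treated[gene] + pseudocount) / (untreated[gene] + pseudocount)
        lfc = log2(ratio)
        if abs(lfc) >= min_abs_log2:
            changes[gene] = lfc
    return changes


# Hypothetical transcript levels for three genes.
treated = {"a": 7, "b": 3, "c": 0}
untreated = {"a": 1, "b": 3, "c": 3}
print(altered_transcripts(treated, untreated))
```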

Another particular embodiment relates to the use of the polypeptide sequences of the present
invention to analyze the proteome of a tissue or cell type. The term proteome refers to the global
pattern of protein expression in a particular tissue or cell type. Each protein component of a proteome
can be subjected individually to further analysis. Proteome expression patterns, or profiles, are
analyzed by quantifying the number of expressed proteins and their relative abundance under given
conditions and at a given time. A profile of a cell's proteome may thus be generated by separating and
analyzing the polypeptides of a particular tissue or cell type. In one embodiment, the separation is
achieved using two-dimensional gel electrophoresis, in which proteins from a sample are separated by
isoelectric focusing in the first dimension, and then according to molecular weight by sodium dodecyl
sulfate slab gel electrophoresis in the second dimension (Steiner and Anderson, supra). The proteins are
visualized in the gel as discrete and uniquely positioned spots, typically by staining the gel with an agent
such as Coomassie Blue or silver or fluorescent stains. The optical density of each protein spot is
generally proportional to the level of the protein in the sample. The optical densities of equivalently
positioned protein spots from different samples, for example, from biological samples either treated or
untreated with a test compound or therapeutic agent, are compared to identify any changes in protein
spot density related to the treatment. The proteins in the spots are partially sequenced using, for
example, standard methods employing chemical or enzymatic cleavage followed by mass spectrometry.
The identity of the protein in a spot may be determined by comparing its partial sequence, preferably of
at least 5 contiguous amino acid residues, to the polypeptide sequences of the present invention. In
some cases, further sequence data may be obtained for definitive protein identification.

A proteomic profile may also be generated using antibodies specific for PRTS to quantify the
levels of PRTS expression. In one embodiment, the antibodies are used as elements on a microarray,
and protein expression levels are quantified by exposing the microarray to the sample and detecting the
levels of protein bound to each array element (Lueking, A. et al. (1999) Anal. Biochem. 270:103-111;
Mendoza, L.G. et al. (1999) Biotechniques 27:778-788). Detection may be performed by a variety of
methods known in the art, for example, by reacting the proteins in the sample with a thiol- or amino-
reactive fluorescent compound and detecting the amount of fluorescence bound at each array element.

Toxicant signatures at the proteome level are also useful for toxicological screening, and should
be analyzed in parallel with toxicant signatures at the transcript level. There is a poor correlation
between transcript and protein abundances for some proteins in some tissues (Anderson, N.L. and J.
Seilhamer (1997) Electrophoresis 18:533-537), so proteome toxicant signatures may be useful in the
analysis of compounds which do not significantly affect the transcript image, but which alter the
proteomic profile. In addition, the analysis of transcripts in body fluids is difficult, due to rapid
degradation of mRNA, so proteomic profiling may be more reliable and informative in such cases.

In another embodiment, the toxicity of a test compound is assessed by treating a biological
sample containing proteins with the test compound. Proteins that are expressed in the treated biological
sample are separated so that the amount of each protein can be quantified. The amount of each protein
is compared to the amount of the corresponding protein in an untreated biological sample. A difference
in the amount of protein between the two samples is indicative of a toxic response to the test compound
in the treated sample. Individual proteins are identified by sequencing the amino acid residues of the
individual proteins and comparing these partial sequences to the polypeptides of the present invention.

In another embodiment, the toxicity of a test compound is assessed by treating a biological
sample containing proteins with the test compound. Proteins from the biological sample are incubated
with antibodies specific to the polypeptides of the present invention. The amount of protein recognized
by the antibodies is quantified. The amount of protein in the treated biological sample is compared with
the amount in an untreated biological sample. A difference in the amount of protein between the two
samples is indicative of a toxic response to the test compound in the treated sample.

Microarrays may be prepared, used, and analyzed using methods known in the art. (See, e.g.,
Brennan, T.M. et al. (1995) U.S. Patent No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad. Sci.
USA 93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/251116; Shalon, D. et al.
(1995) PCT application WO95/35505; Heller, R.A. et al. (1997) Proc. Natl. Acad. Sci. USA 94:2150-
2155; and Heller, M.J. et al. (1997) U.S. Patent No. 5,605,662.) Various types of microarrays are well
known and thoroughly described in DNA Microarrays: A Practical Approach, M. Schena, ed. (1999)
Oxford University Press, London, hereby expressly incorporated by reference.

In another embodiment of the invention, nucleic acid sequences encoding PRTS may be used to
generate hybridization probes useful in mapping the naturally occurring genomic sequence. Either
coding or noncoding sequences may be used, and in some instances, noncoding sequences may be
preferable over coding sequences. For example, conservation of a coding sequence among members
of a multi-gene family may potentially cause undesired cross hybridization during chromosomal
mapping. The sequences may be mapped to a particular chromosome, to a specific region of a
chromosome, or to artificial chromosome constructions, e.g., human artificial chromosomes (HACs),
yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1
constructions, or single chromosome cDNA libraries. (See, e.g., Harrington, J.J. et al. (1997) Nat.
Genet. 15:345-355; Price, C.M. (1993) Blood Rev. 7:127-134; and Trask, B.J. (1991) Trends Genet.
7:149-154.) Once mapped, the nucleic acid sequences of the invention may be used to develop genetic
linkage maps, for example, which correlate the inheritance of a disease state with the inheritance of a
particular chromosome region or restriction fragment length polymorphism (RFLP). (See, for
example, Lander, E.S. and D. Botstein (1986) Proc. Natl. Acad. Sci. USA 83:7353-7357.)

Fluorescent in situ hybridization (FISH) may be correlated with other physical and genetic map
data. (See, e.g., Heinz-Ulrich, et al. (1995) in Meyers, supra, pp. 965-968.) Examples of genetic map
data can be found in various scientific journals or at the Online Mendelian Inheritance in Man (OMIM)
World Wide Web site. Correlation between the location of the gene encoding PRTS on a physical map
and a specific disorder, or a predisposition to a specific disorder, may help define the region of DNA
associated with that disorder and thus may further positional cloning efforts.

In situ hybridization of chromosomal preparations and physical mapping techniques, such as
linkage analysis using established chromosomal markers, may be used for extending genetic maps.
Often the placement of a gene on the chromosome of another mammalian species, such as mouse, may
reveal associated markers even if the exact chromosomal locus is not known. This information is
valuable to investigators searching for disease genes using positional cloning or other gene discovery
techniques. Once the gene or genes responsible for a disease or syndrome have been crudely localized
by genetic linkage to a particular genomic region, e.g., ataxia-telangiectasia to 11q22-23, any sequences
mapping to that area may represent associated or regulatory genes for further investigation. (See, e.g.,
Gatti, R.A. et al. (1988) Nature 336:577-580.) The nucleotide sequence of the instant invention may
also be used to detect differences in the chromosomal location due to translocation, inversion, etc.,
among normal, carrier, or affected individuals.

In another embodiment of the invention, PRTS, its catalytic or immunogenic fragments, or
oligopeptides thereof can be used for screening libraries of compounds in any of a variety of drug
screening techniques. The fragment employed in such screening may be free in solution, affixed to a
solid support, borne on a cell surface, or located intracellularly. The formation of binding complexes
between PRTS and the agent being tested may be measured.

Another technique for drug screening provides for high throughput screening of compounds
having suitable binding affinity to the protein of interest. (See, e.g., Geysen, et al. (1984) PCT
application WO84/03564.) In this method, large numbers of different small test compounds are
synthesized on a solid substrate. The test compounds are reacted with PRTS, or fragments thereof, and
washed. Bound PRTS is then detected by methods well known in the art. Purified PRTS can also be
coated directly onto plates for use in the aforementioned drug screening techniques. Alternatively,
non-neutralizing antibodies can be used to capture the peptide and immobilize it on a solid support.

In another embodiment, one may use competitive drug screening assays in which neutralizing
antibodies capable of binding PRTS specifically compete with a test compound for binding PRTS. In
this manner, antibodies can be used to detect the presence of any peptide which shares one or more
antigenic determinants with PRTS.

In additional embodiments, the nucleotide sequences which encode PRTS may be used in any
molecular biology techniques that have yet to be developed, provided the new techniques rely on
properties of nucleotide sequences that are currently known, including, but not limited to, such
properties as the triplet genetic code and specific base pair interactions.

Without further elaboration, it is believed that one skilled in the art can, using the preceding
description, utilize the present invention to its fullest extent. The following embodiments are,
therefore, to be construed as merely illustrative, and not limitative of the remainder of the disclosure
in any way whatsoever.

The disclosures of all patents, applications, and publications mentioned above and below,
including U.S. Ser. No. 60/212,336, U.S. Ser. No. 60/213,995, U.S. Ser. No. 60/215,396, U.S. Ser.
No. 60/216,821, and U.S. Ser. No. 60/218,946, are hereby expressly incorporated by reference.



EXAMPLES
I. Construction of cDNA Libraries 

Incyte cDNAs were derived from cDNA libraries described in the LIFESEQ GOLD database
(Incyte Genomics, Palo Alto CA) and shown in Table 4, column 5. Some tissues were homogenized
and lysed in guanidinium isothiocyanate, while others were homogenized and lysed in phenol or in a
suitable mixture of denaturants, such as TRIZOL (Life Technologies), a monophasic solution of phenol
and guanidine isothiocyanate. The resulting lysates were centrifuged over CsCl cushions or extracted
with chloroform. RNA was precipitated from the lysates with either isopropanol or sodium acetate and
ethanol, or by other routine methods.

Phenol extraction and precipitation of RNA were repeated as necessary to increase RNA
purity. In some cases, RNA was treated with DNase. For most libraries, poly(A)+ RNA was isolated
using oligo d(T)-coupled paramagnetic particles (Promega), OLIGOTEX latex particles (QIAGEN,
Chatsworth CA), or an OLIGOTEX mRNA purification kit (QIAGEN). Alternatively, RNA was
isolated directly from tissue lysates using other RNA isolation kits, e.g., the POLY(A)PURE mRNA
purification kit (Ambion, Austin TX).

In some cases, Stratagene was provided with RNA and constructed the corresponding cDNA
libraries. Otherwise, cDNA was synthesized and cDNA libraries were constructed with the UNIZAP
vector system (Stratagene) or SUPERSCRIPT plasmid system (Life Technologies), using the
recommended procedures or similar methods known in the art. (See, e.g., Ausubel, 1997, supra, units
5.1-6.6.) Reverse transcription was initiated using oligo d(T) or random primers. Synthetic
oligonucleotide adapters were ligated to double stranded cDNA, and the cDNA was digested with the
appropriate restriction enzyme or enzymes. For most libraries, the cDNA was size-selected (300-1000
bp) using SEPHACRYL S1000, SEPHAROSE CL2B, or SEPHAROSE CL4B column
chromatography (Amersham Pharmacia Biotech) or preparative agarose gel electrophoresis. cDNAs
were ligated into compatible restriction enzyme sites of the polylinker of a suitable plasmid, e.g.,
PBLUESCRIPT plasmid (Stratagene), PSPORT1 plasmid (Life Technologies), PCDNA2.1 plasmid
(Invitrogen, Carlsbad CA), PBK-CMV plasmid (Stratagene), or pINCY (Incyte Genomics, Palo Alto
CA), or derivatives thereof. Recombinant plasmids were transformed into competent E. coli cells
including XL1-Blue, XL1-BlueMRF, or SOLR from Stratagene or DH5α, DH10B, or ElectroMAX
DH10B from Life Technologies.
II. Isolation of cDNA Clones

Plasmids obtained as described in Example I were recovered from host cells by in vivo excision
using the UNIZAP vector system (Stratagene) or by cell lysis. Plasmids were purified using at least
one of the following: a Magic or WIZARD Minipreps DNA purification system (Promega); an AGTC
Miniprep purification kit (Edge Biosystems, Gaithersburg MD); and QIAWELL 8 Plasmid, QIAWELL
8 Plus Plasmid, QIAWELL 8 Ultra Plasmid purification systems or the R.E.A.L. PREP 96 plasmid
purification kit from QIAGEN. Following precipitation, plasmids were resuspended in 0.1 ml of
distilled water and stored, with or without lyophilization, at 4°C.

Alternatively, plasmid DNA was amplified from host cell lysates using direct link PCR in a
high-throughput format (Rao, V.B. (1994) Anal. Biochem. 216:1-14). Host cell lysis and thermal
cycling steps were carried out in a single reaction mixture. Samples were processed and stored in 384-
well plates, and the concentration of amplified plasmid DNA was quantified fluorometrically using
PICOGREEN dye (Molecular Probes, Eugene OR) and a FLUOROSKAN II fluorescence scanner
(Labsystems Oy, Helsinki, Finland).
III. Sequencing and Analysis

Incyte cDNAs recovered in plasmids as described in Example II were sequenced as follows.
Sequencing reactions were processed using standard methods or high-throughput instrumentation
such as the ABI CATALYST 800 (Applied Biosystems) thermal cycler or the PTC-200 thermal cycler
(MJ Research) in conjunction with the HYDRA microdispenser (Robbins Scientific) or the
MICROLAB 2200 (Hamilton) liquid transfer system. cDNA sequencing reactions were prepared
using reagents provided by Amersham Pharmacia Biotech or supplied in ABI sequencing kits such as
the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems).
Electrophoretic separation of cDNA sequencing reactions and detection of labeled polynucleotides were
carried out using the MEGABACE 1000 DNA sequencing system (Molecular Dynamics); the ABI
PRISM 373 or 377 sequencing system (Applied Biosystems) in conjunction with standard ABI
protocols and base calling software; or other sequence analysis systems known in the art. Reading
frames within the cDNA sequences were identified using standard methods (reviewed in Ausubel, 1997,
supra, unit 7.7). Some of the cDNA sequences were selected for extension using the techniques
disclosed in Example VIII.

The polynucleotide sequences derived from Incyte cDNAs were validated by removing vector,
linker, and poly(A) sequences and by masking ambiguous bases, using algorithms and programs based
on BLAST, dynamic programming, and dinucleotide nearest neighbor analysis. The Incyte cDNA
sequences or translations thereof were then queried against a selection of public databases such as the
GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases, and BLOCKS, PRINTS,
DOMO, PRODOM, and hidden Markov model (HMM)-based protein family databases such as PFAM.
(HMM is a probabilistic approach which analyzes consensus primary structures of gene families.
See, for example, Eddy, S.R. (1996) Curr. Opin. Struct. Biol. 6:361-365.) The queries were
performed using programs based on BLAST, FASTA, BLIMPS, and HMMER. The Incyte cDNA
sequences were assembled to produce full length polynucleotide sequences. Alternatively, GenBank
cDNAs, GenBank ESTs, stitched sequences, stretched sequences, or Genscan-predicted coding
sequences (see Examples IV and V) were used to extend Incyte cDNA assemblages to full length.
Assembly was performed using programs based on Phred, Phrap, and Consed, and cDNA assemblages
were screened for open reading frames using programs based on GeneMark, BLAST, and FASTA.
The full length polynucleotide sequences were translated to derive the corresponding full length
polypeptide sequences. Alternatively, a polypeptide of the invention may begin at any of the methionine
residues of the full length translated polypeptide. Full length polypeptide sequences were subsequently
analyzed by querying against databases such as the GenBank protein databases (genpept), SwissProt,
BLOCKS, PRINTS, DOMO, PRODOM, Prosite, and hidden Markov model (HMM)-based protein
family databases such as PFAM. Full length polynucleotide sequences are also analyzed using
MACDNASIS PRO software (Hitachi Software Engineering, South San Francisco CA) and
LASERGENE software (DNASTAR). Polynucleotide and polypeptide sequence alignments are
generated using default parameters specified by the CLUSTAL algorithm as incorporated into the
MEGALIGN multisequence alignment program (DNASTAR), which also calculates the percent
identity between aligned sequences.
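
As an illustration of a percent identity calculation, the sketch below counts matching residues over the aligned columns in which neither sequence has a gap. This is one common convention and is an assumption here; it is not claimed to reproduce the calculation performed by MEGALIGN, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap.

    Both inputs must come from the same alignment and have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are excluded in this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


print(percent_identity("ACGT-A", "ACCTTA"))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so reported values can differ between programs.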

Table 7 summarizes the tools, programs, and algorithms used for the analysis and assembly of
Incyte cDNA and full length sequences and provides applicable descriptions, references, and threshold
parameters. The first column of Table 7 shows the tools, programs, and algorithms used, the second
column provides brief descriptions thereof, the third column presents appropriate references, all of
which are incorporated by reference herein in their entirety, and the fourth column presents, where
applicable, the scores, probability values, and other parameters used to evaluate the strength of a match
between two sequences (the higher the score or the lower the probability value, the greater the identity
between two sequences).

The programs described above for the assembly and analysis of full length polynucleotide and
polypeptide sequences were also used to identify polynucleotide sequence fragments from SEQ ID
NO:22-42. Fragments from about 20 to about 4000 nucleotides which are useful in hybridization and
amplification technologies are described in Table 4, column 4.
IV. Identification and Editing of Coding Sequences from Genomic DNA

Putative proteases were initially identified by running the Genscan gene identification program 
against public genomic sequence databases (e.g., gbpri and gbhtg). Genscan is a general-purpose gene 
identification program which analyzes genomic DNA sequences from a variety of organisms (See 
Burge, C. and S. Karlin (1997) J. Mol. Biol. 268:78-94, and Burge, C. and S. Karlin (1998) Curr. 
Opin. Struct. Biol. 8:346-354). The program concatenates predicted exons to form an assembled 
cDNA sequence extending from a methionine to a stop codon. The output of Genscan is a FASTA 
database of polynucleotide and polypeptide sequences. The maximum range of sequence for Genscan 
to analyze at once was set to 30 kb. To determine which of these Genscan predicted cDNA sequences 
encode proteases, the encoded polypeptides were analyzed by querying against PFAM models for 
proteases. Potential proteases were also identified by homology to Incyte cDNA sequences that had 
been annotated as proteases. These selected Genscan-predicted sequences were then compared by 
BLAST analysis to the genpept and gbpri public databases. Where necessary, the Genscan-predicted 
sequences were then edited by comparison to the top BLAST hit from genpept to correct errors in the 
sequence predicted by Genscan, such as extra or omitted exons. BLAST analysis was also used to find 
any Incyte cDNA or public cDNA coverage of the Genscan-predicted sequences, thus providing 
evidence for transcription. When Incyte cDNA coverage was available, this information was used to 
correct or confirm the Genscan predicted sequence. Full length polynucleotide sequences were obtained 
by assembling Genscan-predicted coding sequences with Incyte cDNA sequences and/or public cDNA 
sequences using the assembly process described in Example III. Alternatively, full length 
polynucleotide sequences were derived entirely from edited or unedited Genscan-predicted coding 
sequences. 
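
The concatenation of predicted exons into a cDNA reading from a methionine codon to a stop codon can be illustrated with a minimal sketch. This is not Genscan itself; the function name, the exon strings, and the checks applied are assumptions made for illustration only.

```python
# Illustrative sketch only: joins predicted exons and checks the open reading
# frame. The function name and validation rules are assumptions, not Genscan's.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def assemble_predicted_cdna(exons):
    """Concatenate exons (given 5'->3') and verify Met-to-stop in one frame."""
    cdna = "".join(exon.upper() for exon in exons)
    if not cdna or len(cdna) % 3 != 0:
        raise ValueError("assembled sequence is not a whole number of codons")
    codons = [cdna[i:i + 3] for i in range(0, len(cdna), 3)]
    if codons[0] != "ATG":
        raise ValueError("assembled sequence does not start with a methionine codon")
    if codons[-1] not in STOP_CODONS:
        raise ValueError("assembled sequence does not end with a stop codon")
    if any(codon in STOP_CODONS for codon in codons[1:-1]):
        raise ValueError("premature stop codon inside the assembled sequence")
    return cdna


if __name__ == "__main__":
    # Three hypothetical exons whose junctions fall inside codons.
    print(assemble_predicted_cdna(["ATGGC", "CAAAT", "AA"]))
```

A check of this kind flags extra or omitted exons only when they disturb the reading frame; the comparison to the top BLAST hit described above is still needed for in-frame errors.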

V. Assembly of Genomic Sequence Data with cDNA Sequence Data 
"Stitched" Sequences 

Partial cDNA sequences were extended with exons predicted by the Genscan gene identification 
program described in Example IV. Partial cDNAs assembled as described in Example III were mapped 
to genomic DNA and parsed into clusters containing related cDNAs and Genscan exon predictions from 
one or more genomic sequences. Each cluster was analyzed using an algorithm based on graph theory 
and dynamic programming to integrate cDNA and genomic information, generating possible splice 
variants that were subsequently confirmed, edited, or extended to create a full length sequence. 
Sequence intervals in which the entire length of the interval was present on more than one sequence in 
the cluster were identified, and intervals thus identified were considered to be equivalent by transitivity. 
For example, if an interval was present on a cDNA and two genomic sequences, then all three intervals 
were considered to be equivalent. This process allows unrelated but consecutive genomic sequences to 
be brought together, bridged by cDNA sequence. Intervals thus identified were then "stitched" together 
by the stitching algorithm in the order that they appear along their parent sequences to generate the 
longest possible sequence, as well as sequence variants. Linkages between intervals which proceed 
along one type of parent sequence (cDNA to cDNA or genomic sequence to genomic sequence) were 
given preference over linkages which change parent type (cDNA to genomic sequence). The resultant 
stitched sequences were translated and compared by BLAST analysis to the genpept and gbpri public 
databases. Incorrect exons predicted by Genscan were corrected by comparison to the top BLAST hit 
from genpept. Sequences were further extended with additional cDNA sequences, or by inspection of 
genomic DNA, when necessary. 
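
The "equivalent by transitivity" step can be illustrated on its own with a small sketch. The text does not give the stitching algorithm's data structures, so the union-find grouping below is a swapped-in, generic technique; the interval labels and the function name are hypothetical.

```python
# Illustrative sketch only: groups intervals that were pairwise matched into
# equivalence classes by transitivity. Union-find is an assumed stand-in for
# the unspecified graph-theory step; interval labels are hypothetical.
def group_equivalent_intervals(pairs):
    """pairs: iterable of (interval_a, interval_b) judged to be the same stretch."""
    parent = {}

    def find(item):
        parent.setdefault(item, item)
        while parent[item] != item:
            parent[item] = parent[parent[item]]  # path halving
            item = parent[item]
        return item

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), set()).add(item)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    matches = [
        ("cDNA1:10-200", "genomicA:5000-5190"),
        ("cDNA1:10-200", "genomicB:120-310"),
    ]
    # One cDNA interval seen on two genomic sequences: all three are equivalent.
    print(group_equivalent_intervals(matches))
```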

"Stretched" Sequences 

Partial DNA sequences were extended to full length with an algorithm based on BLAST 
analysis. First, partial cDNAs assembled as described in Example III were queried against public 
databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases using 
the BLAST program. The nearest GenBank protein homolog was then compared by BLAST analysis 
to either Incyte cDNA sequences or GenScan exon predicted sequences described in Example IV. A 
chimeric protein was generated by using the resultant high-scoring segment pairs (HSPs) to map the 
translated sequences onto the GenBank protein homolog. Insertions or deletions may occur in the 
chimeric protein with respect to the original GenBank protein homolog. The GenBank protein homolog, 
the chimeric protein, or both were used as probes to search for homologous genomic sequences from the 
public human genome databases. Partial DNA sequences were therefore "stretched" or extended by the 
addition of homologous genomic sequences. The resultant stretched sequences were examined to 
determine whether they contained a complete gene. 

VI. Chromosomal Mapping of PRTS Encoding Polynucleotides 

The sequences which were used to assemble SEQ ID NO:22-42 were compared with 
sequences from the Incyte LIFESEQ database and public domain databases using BLAST and other 
implementations of the Smith-Waterman algorithm. Sequences from these databases that matched 
SEQ ID NO:22-42 were assembled into clusters of contiguous and overlapping sequences using 
assembly algorithms such as Phrap (Table 7). Radiation hybrid and genetic mapping data available 
from public resources such as the Stanford Human Genome Center (SHGC), Whitehead Institute for 
Genome Research (WIGR), and Généthon were used to determine if any of the clustered sequences 
had been previously mapped. Inclusion of a mapped sequence in a cluster resulted in the assignment 
of all sequences of that cluster, including its particular SEQ ID NO:, to that map location. 
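
The assignment rule, in which one mapped member places every sequence of its cluster, can be sketched as follows. The cluster and sequence identifiers are hypothetical, and skipping clusters with conflicting locations is an assumption; the text does not say how conflicts were handled.

```python
# Illustrative sketch only: propagates a known map location from any mapped
# member to all members of the same cluster. Identifiers are hypothetical.
def assign_map_locations(clusters, mapped):
    """clusters: {cluster_id: [sequence ids]}; mapped: {sequence id: location}."""
    assignments = {}
    for members in clusters.values():
        locations = {mapped[m] for m in members if m in mapped}
        # Assumption: only clusters with exactly one known location are assigned.
        if len(locations) == 1:
            location = next(iter(locations))
            for member in members:
                assignments[member] = location
    return assignments


if __name__ == "__main__":
    clusters = {"cluster1": ["SEQ_A", "est_1", "marker_X"], "cluster2": ["SEQ_B", "est_2"]}
    mapped = {"marker_X": ("chromosome 5", 69.60, 76.50)}  # interval in centiMorgans
    print(assign_map_locations(clusters, mapped))
```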

Map locations are represented by ranges, or intervals, of human chromosomes. The map 
position of an interval, in centiMorgans, is measured relative to the terminus of the chromosome's 
p-arm. (The centiMorgan (cM) is a unit of measurement based on recombination frequencies between 
chromosomal markers. On average, 1 cM is roughly equivalent to 1 megabase (Mb) of DNA in 
humans, although this can vary widely due to hot and cold spots of recombination.) The cM 
distances are based on genetic markers mapped by Généthon which provide boundaries for radiation 
hybrid markers whose sequences were included in each of the clusters. Human genome maps and 
other resources available to the public, such as the NCBI "GeneMap'99" World Wide Web site 
(http://www.ncbi.nlm.nih.gov/genemap/), can be employed to determine if previously identified 
disease genes map within or in proximity to the intervals indicated above. 

In this manner, SEQ ID NO:25 was mapped to chromosome 5 within the interval from 69.60 
to 76.50 centiMorgans. SEQ ID NO:28 was mapped to chromosome 16 within the interval from 
81.80 to 84.40 centiMorgans. 

VII. Analysis of Polynucleotide Expression 

Northern analysis is a laboratory technique used to detect the presence of a transcript of a gene 
and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs from a 
particular cell type or tissue have been bound. (See, e.g., Sambrook, supra, ch. 7; Ausubel (1995) 
supra, ch. 4 and 16.) 

Analogous computer techniques applying BLAST were used to search for identical or related 
molecules in cDNA databases such as GenBank or LIFESEQ (Incyte Genomics). This analysis is 
much faster than multiple membrane-based hybridizations. In addition, the sensitivity of the computer 
search can be modified to determine whether any particular match is categorized as exact or similar. 
The basis of the search is the product score, which is defined as: 

\[
\frac{\text{BLAST Score} \times \text{Percent Identity}}{5 \times \min\{\text{length(Seq. 1)},\ \text{length(Seq. 2)}\}}
\]



The product score takes into account both the degree of similarity between two sequences and the length 
of the sequence match. The product score is a normalized value between 0 and 100, and is calculated 
as follows: the BLAST score is multiplied by the percent nucleotide identity and the product is divided 
by (5 times the length of the shorter of the two sequences). The BLAST score is calculated by 
assigning a score of +5 for every base that matches in a high-scoring segment pair (HSP), and -4 for 
every mismatch. Two sequences may share more than one HSP (separated by gaps). If there is more 
than one HSP, then the pair with the highest BLAST score is used to calculate the product score. The 
product score represents a balance between fractional overlap and quality in a BLAST alignment. For 
example, a product score of 100 is produced only for 100% identity over the entire length of the shorter 
of the two sequences being compared. A product score of 70 is produced either by 100% identity and 
70% overlap at one end, or by 88% identity and 100% overlap at the other. A product score of 50 is 
produced either by 100% identity and 50% overlap at one end, or 79% identity and 100% overlap. 
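
The arithmetic of the product score can be checked with a short sketch. The function names and the representation of an HSP as simple match and mismatch counts are assumptions for illustration; gaps and the choice among multiple HSPs are not modeled.

```python
# Illustrative sketch only: product score from match/mismatch counts of the
# best HSP. Names and the simplified HSP representation are assumptions.
def blast_score(matches, mismatches):
    """+5 for every matching base in the HSP, -4 for every mismatch."""
    return 5 * matches - 4 * mismatches


def product_score(matches, mismatches, length_seq1, length_seq2):
    """(BLAST score x percent identity) / (5 x length of the shorter sequence)."""
    aligned = matches + mismatches
    if aligned == 0:
        return 0.0
    percent_identity = 100.0 * matches / aligned
    shorter = min(length_seq1, length_seq2)
    return blast_score(matches, mismatches) * percent_identity / (5 * shorter)


if __name__ == "__main__":
    print(product_score(1000, 0, 1000, 1500))   # full-length, 100% identity
    print(product_score(700, 0, 1000, 1000))    # 100% identity, 70% overlap
    print(product_score(880, 120, 1000, 1000))  # 88% identity, 100% overlap
    print(product_score(790, 210, 1000, 1000))  # 79% identity, 100% overlap
```

Under these assumptions the 88% and 79% identity cases come out slightly below 70 and 50 (about 69.0 and 49.1), consistent with the rounded figures quoted in the text.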

Alternatively, polynucleotide sequences encoding PRTS are analyzed with respect to the tissue 
sources from which they were derived. For example, some full length sequences are assembled, at least 
in part, with overlapping Incyte cDNA sequences (see Example III). Each cDNA sequence is derived 
from a cDNA library constructed from a human tissue. Each human tissue is classified into one of the 
following organ/tissue categories: cardiovascular system; connective tissue; digestive system; 
embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia, male; germ cells; 
hemic and immune system; liver; musculoskeletal system; nervous system; pancreas; respiratory 
system; sense organs; skin; stomatognathic system; unclassified/mixed; or urinary tract. The number of 
libraries in each category is counted and divided by the total number of libraries across all categories. 
Similarly, each human tissue is classified into one of the following disease/condition categories: cancer, 
cell line, developmental, inflammation, neurological, trauma, cardiovascular, pooled, and other, and the 
number of libraries in each category is counted and divided by the total number of libraries across all 
categories. The resulting percentages reflect the tissue- and disease-specific expression of cDNA 
encoding PRTS. cDNA sequences and cDNA library/tissue information are found in the LIFESEQ 
GOLD database (Incyte Genomics, Palo Alto CA). 
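
The per-category percentages described above amount to a simple count-and-divide, sketched here. The function name and the sample library list are hypothetical; real category assignments would come from the library annotations.

```python
# Illustrative sketch only: percentage of libraries falling in each category.
# The input list (one category label per library) is hypothetical.
from collections import Counter


def category_percentages(library_categories):
    """Count libraries per category and divide by the total across categories."""
    counts = Counter(library_categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}


if __name__ == "__main__":
    libraries = ["nervous system", "nervous system", "liver", "digestive system"]
    print(category_percentages(libraries))
```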

VIII. Extension of PRTS Encoding Polynucleotides 

Full length polynucleotide sequences were also produced by extension of an appropriate 
fragment of the full length molecule using oligonucleotide primers designed from this fragment. One 
primer was synthesized to initiate 5' extension of the known fragment, and the other primer was 
synthesized to initiate 3' extension of the known fragment. The initial primers were designed using 
OLIGO 4.06 software (National Biosciences), or another appropriate program, to be about 22 to 30 
nucleotides in length, to have a GC content of about 50% or more, and to anneal to the target sequence 
at temperatures of about 68°C to about 72°C. Any stretch of nucleotides which would result in hairpin 
structures and primer-primer dimerizations was avoided. 
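
Two of the stated design criteria, primer length and GC content, are easy to screen for, as the sketch below shows. It is not the OLIGO 4.06 procedure: the function names are hypothetical, the "about" ranges are treated as hard cutoffs by assumption, and annealing temperature, hairpins, and primer-dimers are not evaluated.

```python
# Illustrative sketch only: screens a candidate primer for length and GC
# content. Hard cutoffs are an assumption; Tm and secondary structure are
# deliberately not modeled.
def gc_fraction(primer):
    """Fraction of bases in the primer that are G or C."""
    bases = primer.upper()
    return (bases.count("G") + bases.count("C")) / len(bases)


def meets_basic_criteria(primer, min_len=22, max_len=30, min_gc=0.50):
    """True if the primer is 22 to 30 nt long with at least 50% GC."""
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc


if __name__ == "__main__":
    candidate = "GCGATCGGCTAGCCGATCGGCTAG"  # hypothetical 24-mer
    print(len(candidate), round(gc_fraction(candidate), 3), meets_basic_criteria(candidate))
```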
Selected human cDNA libraries were used to extend the sequence. If more than one extension 
was necessary or desired, additional or nested sets of primers were designed. 

High fidelity amplification was obtained by PCR using methods well known in the art. PCR 
was performed in 96-well plates using the PTC-200 thermal cycler (MJ Research, Inc.). The reaction 
mix contained DNA template, 200 nmol of each primer, reaction buffer containing Mg2+, (NH4)2SO4, 
and 2-mercaptoethanol, Taq DNA polymerase (Amersham Pharmacia Biotech), ELONGASE enzyme 
(Life Technologies), and Pfu DNA polymerase (Stratagene), with the following parameters for primer 
pair PCI A and PCI B: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 68°C, 
2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage at 4°C. In the 
alternative, the parameters for primer pair T7 and SK+ were as follows: Step 1: 94°C, 3 min; Step 2: 
94°C, 15 sec; Step 3: 57°C, 1 min; Step 4: 68°C, 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; 
Step 6: 68°C, 5 min; Step 7: storage at 4°C. 

The concentration of DNA in each well was determined by dispensing 100 µl PICOGREEN 
quantitation reagent (0.25% (v/v) PICOGREEN; Molecular Probes, Eugene OR) dissolved in 1X TE 
and 0.5 µl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Costar, 
Acton MA), allowing the DNA to bind to the reagent. The plate was scanned in a Fluoroskan II 
(Labsystems Oy, Helsinki, Finland) to measure the fluorescence of the sample and to quantify the 
concentration of DNA. A 5 µl to 10 µl aliquot of the reaction mixture was analyzed by electrophoresis 
on a 1% agarose gel to determine which reactions were successful in extending the sequence. 

The extended nucleotides were desalted and concentrated, transferred to 384-well plates, 
digested with CviJI cholera virus endonuclease (Molecular Biology Research, Madison WI), and 
sonicated or sheared prior to religation into pUC 18 vector (Amersham Pharmacia Biotech). For 
shotgun sequencing, the digested nucleotides were separated on low concentration (0.6 to 0.8%) agarose 
gels, fragments were excised, and agar digested with Agar ACE (Promega). Extended clones were 
religated using T4 ligase (New England Biolabs, Beverly MA) into pUC 18 vector (Amersham 
Pharmacia Biotech), treated with Pfu DNA polymerase (Stratagene) to fill-in restriction site overhangs, 
and transfected into competent E. coli cells. Transformed cells were selected on antibiotic-containing 
media, and individual colonies were picked and cultured overnight at 37°C in 384-well plates in LB/2x 
carb liquid media. 

The cells were lysed, and DNA was amplified by PCR using Taq DNA polymerase (Amersham 
Pharmacia Biotech) and Pfu DNA polymerase (Stratagene) with the following parameters: Step 1: 
94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 72°C, 2 min; Step 5: steps 2, 3, and 4 
repeated 29 times; Step 6: 72°C, 5 min; Step 7: storage at 4°C. DNA was quantified by PICOGREEN 
reagent (Molecular Probes) as described above. Samples with low DNA recoveries were reamplified 
using the same conditions as described above. Samples were diluted with 20% dimethylsulfoxide (1:2, 
v/v), and sequenced using DYENAMIC energy transfer sequencing primers and the DYENAMIC 
DIRECT kit (Amersham Pharmacia Biotech) or the ABI PRISM BIGDYE Terminator cycle 
sequencing ready reaction kit (Applied Biosystems). 

In like manner, full length polynucleotide sequences are verified using the above procedure or 
are used to obtain 5' regulatory sequences using the above procedure along with oligonucleotides 
designed for such extension, and an appropriate genomic library. 

IX. Labeling and Use of Individual Hybridization Probes 

Hybridization probes derived from SEQ ID NO:22-42 are employed to screen cDNAs, genomic 
DNAs, or mRNAs. Although the labeling of oligonucleotides, consisting of about 20 base pairs, is 
specifically described, essentially the same procedure is used with larger nucleotide fragments. 
Oligonucleotides are designed using state-of-the-art software such as OLIGO 4.06 software (National 
Biosciences) and labeled by combining 50 pmol of each oligomer, 250 µCi of [γ-32P] adenosine 
triphosphate (Amersham Pharmacia Biotech), and T4 polynucleotide kinase (DuPont NEN, Boston 
MA). The labeled oligonucleotides are substantially purified using a SEPHADEX G-25 superfine size 
exclusion dextran bead column (Amersham Pharmacia Biotech). An aliquot containing 10^7 counts per 
minute of the labeled probe is used in a typical membrane-based hybridization analysis of human 
genomic DNA digested with one of the following endonucleases: Ase I, Bgl II, Eco RI, Pst I, Xba I, or 
Pvu II (DuPont NEN). 

The DNA from each digest is fractionated on a 0.7% agarose gel and transferred to nylon 
membranes (Nytran Plus, Schleicher & Schuell, Durham NH). Hybridization is carried out for 16 
hours at 40°C. To remove nonspecific signals, blots are sequentially washed at room temperature 
under conditions of up to, for example, 0.1 x saline sodium citrate and 0.5% sodium dodecyl sulfate. 
Hybridization patterns are visualized using autoradiography or an alternative imaging means and 
compared. 

X. Microarrays 

The linkage or synthesis of array elements upon a microarray can be achieved utilizing 
photolithography, piezoelectric printing (ink-jet printing, See, e.g., Baldeschweiler, supra), mechanical 
microspotting technologies, and derivatives thereof. The substrate in each of the aforementioned 
technologies should be uniform and solid with a non-porous surface (Schena (1999), supra). Suggested 
substrates include silicon, silica, glass slides, glass chips, and silicon wafers. Alternatively, a procedure 
analogous to a dot or slot blot may also be used to arrange and link elements to the surface of a 
substrate using thermal, UV, chemical, or mechanical bonding procedures. A typical array may be 
produced using available methods and machines well known to those of ordinary skill in the art and may 
contain any appropriate number of elements. (See, e.g., Schena, M. et al. (1995) Science 270:467-470; 
Shalon, D. et al. (1996) Genome Res. 6:639-645; Marshall, A. and J. Hodgson (1998) Nat. Biotechnol. 
16:27-31.) 

Full length cDNAs, Expressed Sequence Tags (ESTs), or fragments or oligomers thereof may 
comprise the elements of the microarray. Fragments or oligomers suitable for hybridization can be 
selected using software well known in the art such as LASERGENE software (DNASTAR). The array 
elements are hybridized with polynucleotides in a biological sample. The polynucleotides in the 
biological sample are conjugated to a fluorescent label or other molecular tag for ease of detection. 
After hybridization, nonhybridized nucleotides from the biological sample are removed, and a 
fluorescence scanner is used to detect hybridization at each array element. Alternatively, laser 
desorption and mass spectrometry may be used for detection of hybridization. The degree of 
complementarity and the relative abundance of each polynucleotide which hybridizes to an element on 
the microarray may be assessed. In one embodiment, microarray preparation and usage is described in 
detail below. 

Tissue or Cell Sample Preparation 

Total RNA is isolated from tissue samples using the guanidinium thiocyanate method and 
poly(A)+ RNA is purified using the oligo-(dT) cellulose method. Each poly(A)+ RNA sample is 
reverse transcribed using MMLV reverse-transcriptase, 0.05 pg/µl oligo-(dT) primer (21mer), 1X first 
strand buffer, 0.03 units/µl RNase inhibitor, 500 µM dATP, 500 µM dGTP, 500 µM dTTP, 40 µM 
dCTP, 40 µM dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Pharmacia Biotech). The reverse 
transcription reaction is performed in a 25 µl volume containing 200 ng poly(A)+ RNA with 
GEMBRIGHT kits (Incyte). Specific control poly(A)+ RNAs are synthesized by in vitro transcription 
from non-coding yeast genomic DNA. After incubation at 37°C for 2 hr, each reaction sample (one 
with Cy3 and another with Cy5 labeling) is treated with 2.5 µl of 0.5M sodium hydroxide and 
incubated for 20 minutes at 85°C to stop the reaction and degrade the RNA. Samples are purified 
using two successive CHROMA SPIN 30 gel filtration spin columns (CLONTECH Laboratories, Inc. 
(CLONTECH), Palo Alto CA) and after combining, both reaction samples are ethanol precipitated 
using 1 µl of glycogen (1 mg/ml), 60 µl sodium acetate, and 300 µl of 100% ethanol. The sample is 
then dried to completion using a SpeedVAC (Savant Instruments Inc., Holbrook NY) and 
resuspended in 14 µl 5X SSC/0.2% SDS. 

Microarray Preparation 

Sequences of the present invention are used to generate array elements. Each array element is 
amplified from bacterial cells containing vectors with cloned cDNA inserts. PCR amplification uses 
primers complementary to the vector sequences flanking the cDNA insert. Array elements are 
amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final quantity greater than 5 
µg. Amplified array elements are then purified using SEPHACRYL-400 (Amersham Pharmacia 
Biotech). 

Purified array elements are immobilized on polymer-coated glass slides. Glass microscope 
slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive distilled water 
washes between and after treatments. Glass slides are etched in 4% hydrofluoric acid (VWR 
Scientific Products Corporation (VWR), West Chester PA), washed extensively in distilled water, and 
coated with 0.05% aminopropyl silane (Sigma) in 95% ethanol. Coated slides are cured in a 110°C 
oven. 

Array elements are applied to the coated glass substrate using a procedure described in US 
Patent No. 5,807,522, incorporated herein by reference. 1 µl of the array element DNA, at an average 
concentration of 100 ng/µl, is loaded into the open capillary printing element by a high-speed robotic 
apparatus. The apparatus then deposits about 5 nl of array element sample per slide. 

Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene). 
Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water. 
Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate 
buffered saline (PBS) (Tropix, Inc., Bedford MA) for 30 minutes at 60°C followed by washes in 
0.2% SDS and distilled water as before. 

Hybridization 

Hybridization reactions contain 9 µl of sample mixture consisting of 0.2 µg each of Cy3 and 
Cy5 labeled cDNA synthesis products in 5X SSC, 0.2% SDS hybridization buffer. The sample 
mixture is heated to 65°C for 5 minutes and is aliquoted onto the microarray surface and covered with 
a 1.8 cm2 coverslip. The arrays are transferred to a waterproof chamber having a cavity just slightly 
larger than a microscope slide. The chamber is kept at 100% humidity internally by the addition of 
140 µl of 5X SSC in a corner of the chamber. The chamber containing the arrays is incubated for 
about 6.5 hours at 60°C. The arrays are washed for 10 min at 45°C in a first wash buffer (1X SSC, 
0.1% SDS), three times for 10 minutes each at 45°C in a second wash buffer (0.1X SSC), and dried. 

Detection 

Reporter-labeled hybridization complexes are detected with a microscope equipped with an 
Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara CA) capable of generating spectral lines 
at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5. The excitation laser light is 
focused on the array using a 20X microscope objective (Nikon, Inc., Melville NY). The slide 
containing the array is placed on a computer-controlled X-Y stage on the microscope and 
raster-scanned past the objective. The 1.8 cm x 1.8 cm array used in the present example is scanned with a 
resolution of 20 micrometers. 
In two separate scans, a mixed gas multiline laser excites the two fluorophores sequentially. 
Emitted light is split, based on wavelength, into two photomultiplier tube detectors (PMT R1477, 
Hamamatsu Photonics Systems, Bridgewater NJ) corresponding to the two fluorophores. Appropriate 
filters positioned between the array and the photomultiplier tubes are used to filter the signals. The 
emission maxima of the fluorophores used are 565 nm for Cy3 and 650 nm for Cy5. Each array is 
typically scanned twice, one scan per fluorophore using the appropriate filters at the laser source, 
although the apparatus is capable of recording the spectra from both fluorophores simultaneously. 

The sensitivity of the scans is typically calibrated using the signal intensity generated by a 
cDNA control species added to the sample mixture at a known concentration. A specific location on 
the array contains a complementary DNA sequence, allowing the intensity of the signal at that 
location to be correlated with a weight ratio of hybridizing species of 1:100,000. When two samples 
from different sources (e.g., representing test and control cells), each labeled with a different 
fluorophore, are hybridized to a single array for the purpose of identifying genes that are differentially 
expressed, the calibration is done by labeling samples of the calibrating cDNA with the two 
fluorophores and adding identical amounts of each to the hybridization mixture. 

The output of the photomultiplier tube is digitized using a 12-bit RTI-835H analog-to-digital 
(A/D) conversion board (Analog Devices, Inc., Norwood MA) installed in an IBM-compatible PC 
computer. The digitized data are displayed as an image where the signal intensity is mapped using a 
linear 20-color transformation to a pseudocolor scale ranging from blue (low signal) to red (high 
signal). The data is also analyzed quantitatively. Where two different fluorophores are excited and 
measured simultaneously, the data are first corrected for optical crosstalk (due to overlapping 
emission spectra) between the fluorophores using each fluorophore's emission spectrum. 
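
A linear 20-color mapping of 12-bit digitized values can be sketched as follows. The assumption that the full 12-bit range (0 to 4095) is divided into 20 equal bins, with bin 0 displayed as blue and bin 19 as red, is made for illustration; the text does not specify the bin edges or the intermediate colors.

```python
# Illustrative sketch only: maps a 12-bit intensity to one of 20 pseudocolor
# bins. Equal-width bins over 0..4095 are an assumption.
ADC_LEVELS = 4096  # 12-bit converter
NUM_COLORS = 20


def pseudocolor_bin(value):
    """Return a bin index from 0 (lowest signal, blue) to 19 (highest, red)."""
    if not 0 <= value < ADC_LEVELS:
        raise ValueError("value is outside the 12-bit range")
    return value * NUM_COLORS // ADC_LEVELS


if __name__ == "__main__":
    for sample in (0, 2048, 4095):
        print(sample, pseudocolor_bin(sample))
```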

A grid is superimposed over the fluorescence signal image such that the signal from each spot 
is centered in each element of the grid. The fluorescence signal within each element is then integrated 
to obtain a numerical value corresponding to the average intensity of the signal. The software used 
for signal analysis is the GEMTOOLS gene expression analysis program (Incyte). 
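
The grid integration step can be pictured with a minimal sketch that averages pixel values inside square grid elements. This is not the GEMTOOLS program; the function name, the square element shape, and the tiny sample image are assumptions for illustration.

```python
# Illustrative sketch only: average intensity per square grid element of a
# scanned image given as a list of pixel rows. Not the GEMTOOLS algorithm.
def spot_intensities(image, cell_size):
    """Return a 2-D list of mean pixel values, one per grid element."""
    rows, cols = len(image), len(image[0])
    result = []
    for top in range(0, rows - cell_size + 1, cell_size):
        row_values = []
        for left in range(0, cols - cell_size + 1, cell_size):
            total = sum(
                image[r][c]
                for r in range(top, top + cell_size)
                for c in range(left, left + cell_size)
            )
            row_values.append(total / (cell_size * cell_size))
        result.append(row_values)
    return result


if __name__ == "__main__":
    image = [
        [1, 3, 10, 10],
        [5, 7, 10, 10],
    ]
    print(spot_intensities(image, 2))
```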

XI. Complementary Polynucleotides 

Sequences complementary to the PRTS-encoding sequences, or any parts thereof, are used to 
detect, decrease, or inhibit expression of naturally occurring PRTS. Although use of oligonucleotides 
comprising from about 15 to 30 base pairs is described, essentially the same procedure is used with 
smaller or with larger sequence fragments. Appropriate oligonucleotides are designed using OLIGO 
4.06 software (National Biosciences) and the coding sequence of PRTS. To inhibit transcription, a 
complementary oligonucleotide is designed from the most unique 5' sequence and used to prevent 
promoter binding to the coding sequence. To inhibit translation, a complementary oligonucleotide is 
designed to prevent ribosomal binding to the PRTS-encoding transcript. 
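
Deriving an oligonucleotide complementary to a chosen stretch of coding sequence reduces to taking a reverse complement, sketched below. The function names and the example sequence are hypothetical; choosing which stretch is "most unique" is left to the design software and is not modeled.

```python
# Illustrative sketch only: reverse complement of a window of coding sequence.
# Function names and the example sequence are hypothetical.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def complementary_oligo(coding_sequence, start, length):
    """Oligonucleotide complementary to coding_sequence[start:start + length]."""
    return reverse_complement(coding_sequence[start:start + length])


if __name__ == "__main__":
    coding = "ATGGCCAAATAA"  # hypothetical coding sequence
    print(complementary_oligo(coding, 0, 6))
```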

XII. Expression of PRTS 

Expression and purification of PRTS is achieved using bacterial or virus-based expression 
systems. For expression of PRTS in bacteria, cDNA is subcloned into an appropriate vector containing 
an antibiotic resistance gene and an inducible promoter that directs high levels of cDNA transcription. 
Examples of such promoters include, but are not limited to, the trp-lac (tac) hybrid promoter and the 
T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory element. 
Recombinant vectors are transformed into suitable bacterial hosts, e.g., BL21(DE3). Antibiotic 
resistant bacteria express PRTS upon induction with isopropyl beta-D-thiogalactopyranoside (IPTG). 
Expression of PRTS in eukaryotic cells is achieved by infecting insect or mammalian cell lines with 
recombinant Autographica californica nuclear polyhedrosis virus (AcMNPV), commonly known as 
baculovirus. The nonessential polyhedrin gene of baculovirus is replaced with cDNA encoding PRTS 
by either homologous recombination or bacterial-mediated transposition involving transfer plasmid 
intermediates. Viral infectivity is maintained and the strong polyhedrin promoter drives high levels of 
cDNA transcription. Recombinant baculovirus is used to infect Spodoptera frugiperda (Sf9) insect 
cells in most cases, or human hepatocytes, in some cases. Infection of the latter requires additional 
genetic modifications to baculovirus. (See Engelhard, E.K. et al. (1994) Proc. Natl. Acad. Sci. USA 
91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945.) 

In most expression systems, PRTS is synthesized as a fusion protein with, e.g., glutathione 
S-transferase (GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step, 
affinity-based purification of recombinant fusion protein from crude cell lysates. GST, a 26-kilodalton 
enzyme from Schistosoma japonicum, enables the purification of fusion proteins on immobilized 
glutathione under conditions that maintain protein activity and antigenicity (Amersham Pharmacia 
Biotech). Following purification, the GST moiety can be proteolytically cleaved from PRTS at 
specifically engineered sites. FLAG, an 8-amino acid peptide, enables immunoaffinity purification 
using commercially available monoclonal and polyclonal anti-FLAG antibodies (Eastman Kodak). 
6-His, a stretch of six consecutive histidine residues, enables purification on metal-chelate resins 
(QIAGEN). Methods for protein expression and purification are discussed in Ausubel (1995, supra, 
ch. 10 and 16). Purified PRTS obtained by these methods can be used directly in the assays shown in 
Examples XVI, XVII, XVIII, and XIX, where applicable. 

XIII. Functional Assays 

PRTS function is assessed by expressing the sequences encoding PRTS at physiologically 
elevated levels in mammalian cell culture systems. cDNA is subcloned into a mammalian expression 
vector containing a strong promoter that drives high levels of cDNA expression. Vectors of choice 
include PCMV SPORT (Life Technologies) and PCR3.1 (Invitrogen, Carlsbad CA), both of which 
contain the cytomegalovirus promoter. 5-10 µg of recombinant vector are transiently transfected into a 
human cell line, for example, an endothelial or hematopoietic cell line, using either liposome 
formulations or electroporation. 1-2 µg of an additional plasmid containing sequences encoding a 
marker protein are co-transfected. Expression of a marker protein provides a means to distinguish 
transfected cells from nontransfected cells and is a reliable predictor of cDNA expression from the 
recombinant vector. Marker proteins of choice include, e.g., Green Fluorescent Protein (GFP; 
Clontech), CD64, or a CD64-GFP fusion protein. Flow cytometry (FCM), an automated, laser 
optics-based technique, is used to identify transfected cells expressing GFP or CD64-GFP and to evaluate the 
apoptotic state of the cells and other cellular properties. FCM detects and quantifies the uptake of 
fluorescent molecules that diagnose events preceding or coincident with cell death. These events include 
changes in nuclear DNA content as measured by staining of DNA with propidium iodide; changes in 
cell size and granularity as measured by forward light scatter and 90 degree side light scatter; 
down-regulation of DNA synthesis as measured by decrease in bromodeoxyuridine uptake; alterations in 
expression of cell surface and intracellular proteins as measured by reactivity with specific antibodies; 
and alterations in plasma membrane composition as measured by the binding of fluorescein-conjugated 
Annexin V protein to the cell surface. Methods in flow cytometry are discussed in Ormerod, M.G. 
(1994) Flow Cytometry, Oxford, New York NY. 

The influence of PRTS on gene expression can be assessed using highly purified populations of 
cells transfected with sequences encoding PRTS and either CD64 or CD64-GFP. CD64 and 
CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions of human 
immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected cells using 
magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Lake Success NY). 
mRNA can be purified from the cells using methods well known by those of skill in the art. Expression 
of mRNA encoding PRTS and other genes of interest can be analyzed by northern analysis or 
microarray techniques. 

XIV. Production of PRTS Specific Antibodies 

PRTS substantially purified using polyacrylamide gel electrophoresis (PAGE; see, e.g., 
Harrington, M.G. (1990) Methods Enzymol. 182:488-495), or other purification techniques, is used to 
immunize rabbits and to produce antibodies using standard protocols. 

Alternatively, the PRTS amino acid sequence is analyzed using LASERGENE software 
(DNASTAR) to determine regions of high immunogenicity, and a corresponding oligopeptide is 
synthesized and used to raise antibodies by means known to those of skill in the art. Methods for 
selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions are well 
described in the art. (See, e.g., Ausubel, 1995, supra, ch. 11.) 

Typically, oligopeptides of about 15 residues in length are synthesized using an ABI 431A 
peptide synthesizer (Applied Biosystems) using FMOC chemistry and coupled to KLH 
(Sigma-Aldrich, St. Louis MO) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to 
increase immunogenicity. (See, e.g., Ausubel, 1995, supra.) Rabbits are immunized with the 
oligopeptide-KLH complex in complete Freund's adjuvant. Resulting antisera are tested for antipeptide 
and anti-PRTS activity by, for example, binding the peptide or PRTS to a substrate, blocking with 1% 
BSA, reacting with rabbit antisera, washing, and reacting with radio-iodinated goat anti-rabbit IgG. 

XV. Purification of Naturally Occurring PRTS Using Specific Antibodies 
Naturally occurring or recombinant PRTS is substantially purified by immunoaffinity
chromatography using antibodies specific for PRTS. An immunoaffinity column is constructed by
covalently coupling anti-PRTS antibody to an activated chromatographic resin, such as CNBr-activated
SEPHAROSE (Amersham Pharmacia Biotech). After the coupling, the resin is blocked and washed
according to the manufacturer's instructions.

Media containing PRTS are passed over the immunoaffinity column, and the column is washed
under conditions that allow the preferential absorbance of PRTS (e.g., high ionic strength buffers in the
presence of detergent). The column is eluted under conditions that disrupt antibody/PRTS binding
(e.g., a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such as urea or thiocyanate ion),
and PRTS is collected.

XVI. Identification of Molecules Which Interact with PRTS 

PRTS, or biologically active fragments thereof, are labeled with 125I Bolton-Hunter reagent.
(See, e.g., Bolton A.E. and W.M. Hunter (1973) Biochem. J. 133:529-539.) Candidate molecules
previously arrayed in the wells of a multi-well plate are incubated with the labeled PRTS, washed, and
any wells with labeled PRTS complex are assayed. Data obtained using different concentrations of
PRTS are used to calculate values for the number, affinity, and association of PRTS with the candidate
molecules.
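
One conventional way to derive such values from binding data is a Scatchard analysis. The following is a minimal Python sketch under stated assumptions: bound and free concentrations of labeled PRTS are taken to have been measured at each input concentration, binding is taken to follow a single-site model, and the function names and example numbers are illustrative only and are not part of the methods described above.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def scatchard(bound, free):
    """Estimate (Kd, Bmax) from paired bound and free ligand concentrations.

    For single-site binding, bound/free = (Bmax - bound) / Kd, so a plot of
    bound/free against bound has slope -1/Kd and x-intercept Bmax.
    """
    ratios = [b / f for b, f in zip(bound, free)]
    slope, intercept = linear_fit(bound, ratios)
    kd = -1.0 / slope
    bmax = intercept * kd  # y-intercept is Bmax/Kd
    return kd, bmax


# Hypothetical example: synthetic data generated with Kd = 2 and Bmax = 10
# (arbitrary concentration units), not measured values.
free = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [10.0 * f / (2.0 + f) for f in free]
kd, bmax = scatchard(bound, free)
print(f"Kd = {kd:.3f}, Bmax = {bmax:.3f}")
```

Here Bmax corresponds to the number of binding sites and Kd to the affinity. Real data with more than one class of site would not fall on a single line and would call for a different model.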

Alternatively, molecules interacting with PRTS are analyzed using the yeast two-hybrid
system as described in Fields, S. and O. Song (1989) Nature 340:245-246, or using commercially
available kits based on the two-hybrid system, such as the MATCHMAKER system (Clontech).

PRTS may also be used in the PATHCALLING process (CuraGen Corp., New Haven CT)
which employs the yeast two-hybrid system in a high-throughput manner to determine all interactions
between the proteins encoded by two large libraries of genes (Nandabalan, K. et al. (2000) U.S. Patent
No. 6,057,101).

XVII. Demonstration of PRTS Activity

Protease activity is measured by the hydrolysis of appropriate synthetic peptide substrates
conjugated with various chromogenic molecules in which the degree of hydrolysis is quantified by
spectrophotometric (or fluorometric) absorption of the released chromophore (Beynon, R.J. and J.S.
Bond (1994) Proteolytic Enzymes: A Practical Approach, Oxford University Press, New York NY,
pp.25-55). Peptide substrates are designed according to the category of protease activity as
endopeptidase (serine, cysteine, aspartic proteases, or metalloproteases), aminopeptidase (leucine
aminopeptidase), or carboxypeptidase (carboxypeptidases A and B, procollagen C-proteinase).
Commonly used chromogens are 2-naphthylamine, 4-nitroaniline, and furylacrylic acid. For example,
arginine-β-naphthylamide can be used as a substrate for SEQ ID NO:3 (Fukasawa, K.M. et al. (1996) J.
Biol. Chem. 271:30731-30735) and 4-phenylazobenzyloxycarbonyl-Pro-Leu-Gly-Pro-D-Arg can be
used as a substrate for SEQ ID NO:4. In an alternative example, a substrate for SEQ ID NO:9 would
be 7-amino-4-trifluoromethyl coumarin-Phe-Pro-AFC. Assays are performed at ambient temperature
and contain an aliquot of the enzyme and the appropriate substrate in a suitable buffer. Reactions are
carried out in an optical cuvette, and the increase/decrease in absorbance of the chromogen released
during hydrolysis of the peptide substrate is measured. The change in absorbance is proportional to the
enzyme activity in the assay.
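
The proportionality between absorbance change and enzyme activity can be illustrated with the Beer-Lambert law. The sketch below is a hedged example, not the procedure of the cited reference: the function name, the 1 cm path length, the 1 ml reaction volume, and the extinction coefficient used in the example are assumptions chosen for illustration, and the extinction coefficient of the actual chromophore at the measured wavelength would have to be supplied by the user.

```python
def activity_from_absorbance(times_min, absorbances, epsilon_per_m_cm,
                             path_cm=1.0, volume_ml=1.0):
    """Return product formation in micromoles per minute.

    The slope of absorbance against time (least squares) is converted to a
    molar rate with the Beer-Lambert law, A = epsilon * c * l, and then
    scaled by the reaction volume.
    """
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    stt = sum((t - mean_t) ** 2 for t in times_min)
    sta = sum((t - mean_t) * (a - mean_a)
              for t, a in zip(times_min, absorbances))
    slope_per_min = sta / stt                      # absorbance units per minute
    molar_rate = slope_per_min / (epsilon_per_m_cm * path_cm)  # mol/L per minute
    litres = volume_ml / 1000.0
    return molar_rate * litres * 1e6               # micromoles per minute


# Hypothetical readings; epsilon = 10000 per M per cm is an illustrative value.
rate = activity_from_absorbance([0, 1, 2, 3], [0.10, 0.20, 0.30, 0.40],
                                epsilon_per_m_cm=10000.0)
print(f"{rate:.4f} micromol/min")
```

Only the initial, linear portion of the progress curve should be passed to such a calculation, since the proportionality fails once substrate is depleted.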

An alternate assay for ubiquitin hydrolase activity measures the hydrolysis of a ubiquitin
precursor. The assay is performed at ambient temperature and contains an aliquot of PRTS and the
appropriate substrate in a suitable buffer. For SEQ ID NO:1, chemically synthesized human
ubiquitin-valine may be used as substrate. Cleavage of the C-terminal valine residue from the substrate
is monitored by capillary electrophoresis (Franklin, K. et al. (1997) Anal. Biochem. 247:305-309).

Alternatively, the ubiquitin protease activity of SEQ ID NO:5 can be measured using the
method of Sloper-Mould et al. ((1999) J. Biol. Chem. 274:26878-26884). Aliquots of PRTS are
incubated with 5 µl [35S]-labeled ubiquitin-GST fusion substrate for 1 hour at 37 °C in an appropriate
buffer. Samples are resolved by electrophoresis on a 12% SDS-PAGE gel. Ubiquitin cleavage is
monitored by fluorography of the gel.

Alternatively, the activity of SEQ ID NO:10, for example, can be measured by the method of
Colige et al. (1995, J. Biol. Chem. 270:16724-16730). An aliquot of PRTS is incubated with amino
procollagen type I substrate radioactively labeled only in the propeptide in a 250 µl reaction containing
50 mM sodium cacodylate, pH 7.5, 200 mM KCl, 2 mM CaCl2, 2.5 mM NEM, 0.5 mM PMSF, and
0.02% Brij (standard assay buffer). After 16 h at 26 °C, the reaction is stopped by adding 50 µl of
EDTA solution (0.2 M EDTA, pH 8, 0.5% SDS, 0.5 M DTT) and 300 µl of 99% ethanol. The
samples are kept for 30 min at 4 °C and centrifuged for 30 min at 9500 g. Collagen and uncleaved
radioactive pN-collagen substrate are pelleted, whereas the freed amino propeptides remain in
solution. An aliquot of the supernatant is assayed by liquid scintillation spectrometry.

In the alternative, an assay for protease activity takes advantage of fluorescence resonance
energy transfer (FRET) that occurs when one donor and one acceptor fluorophore with an appropriate
spectral overlap are in close proximity. A flexible peptide linker containing a cleavage site specific for
PRTS is fused between a red-shifted variant (RSGFP4) and a blue variant (BFP5) of Green Fluorescent
Protein. This fusion protein has spectral properties that suggest energy transfer is occurring from BFP5
to RSGFP4. When the fusion protein is incubated with PRTS, the substrate is cleaved, and the two
fluorescent proteins dissociate. This is accompanied by a marked decrease in energy transfer which is
quantified by comparing the emission spectra before and after the addition of PRTS (Mitra, R.D. et al.
(1996) Gene 173:13-17). This assay can also be performed in living cells. In this case the fluorescent
substrate protein is expressed constitutively in cells and PRTS is introduced on an inducible vector so
that FRET can be monitored in the presence and absence of PRTS (Sagot, I. et al. (1999) FEBS Lett.
447:53-57).
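
The comparison of emission spectra can be reduced to a single number by taking the ratio of acceptor to donor emission before and after the addition of PRTS. The Python sketch below illustrates this under stated assumptions: each spectrum is represented as a mapping from wavelength (nm) to intensity, and the wavelengths 450 and 510 and all intensities are hypothetical placeholders, not values taken from the cited references.

```python
def fret_ratio(spectrum, donor_nm, acceptor_nm):
    """Acceptor emission divided by donor emission for one spectrum."""
    return spectrum[acceptor_nm] / spectrum[donor_nm]


def fractional_fret_loss(before, after, donor_nm, acceptor_nm):
    """Fractional drop in the acceptor/donor ratio after protease addition.

    0.0 means no change; values approaching 1.0 mean energy transfer has
    largely been lost, as expected when the linker has been cleaved.
    """
    r_before = fret_ratio(before, donor_nm, acceptor_nm)
    r_after = fret_ratio(after, donor_nm, acceptor_nm)
    return 1.0 - r_after / r_before


# Hypothetical spectra (wavelength in nm -> intensity in arbitrary units).
before = {450: 100.0, 510: 80.0}   # intact fusion protein: strong acceptor signal
after = {450: 150.0, 510: 30.0}    # after cleavage: donor rises, acceptor falls
loss = fractional_fret_loss(before, after, donor_nm=450, acceptor_nm=510)
print(f"fractional loss of FRET ratio: {loss:.2f}")
```

A ratio is used rather than the raw acceptor intensity so that differences in total protein amount between measurements largely cancel.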

In yet another alternative, an assay for PRTS dipeptidase activity measures the hydrolysis
activity of PRTS on a variety of dipeptides such as leukotriene D4 (Kozak, E. and S. Tate (1982) J.
Biol. Chem. 257:6322-6327), or hydrolysis of the beta-lactam ring of antibiotics such as penem and
carbapenem (Campbell et al. (1984) J. Biol. Chem. 259:14586-14590).

XVIII. Identification of PRTS Substrates

Phage display libraries can be used to identify optimal substrate sequences for PRTS. A
random hexamer followed by a linker and a known antibody epitope is cloned as an N-terminal
extension of gene III in a filamentous phage library. Gene III codes for a coat protein, and the epitope
will be displayed on the surface of each phage particle. The library is incubated with PRTS under
proteolytic conditions so that the epitope will be removed if the hexamer codes for a PRTS cleavage
site. An antibody that recognizes the epitope is added along with immobilized protein A. Uncleaved
phage, which still bear the epitope, are removed by centrifugation. Phage in the supernatant are then
amplified and undergo several more rounds of screening. Individual phage clones are then isolated and
sequenced. Reaction kinetics for these peptide substrates can be studied using an assay in Example
XVII, and an optimal cleavage sequence can be derived (Ke, S.H. et al. (1997) J. Biol. Chem.
272:16603-16609).
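
Once individual clones have been sequenced, a first approximation of a preferred cleavage sequence can be obtained by tabulating residue frequencies at each hexamer position. The following Python sketch is illustrative only: the hexamer sequences are invented placeholders, and a simple per-position consensus is an assumption standing in for the kinetic analysis described above, which remains the basis for deriving an optimal sequence.

```python
from collections import Counter


def position_frequencies(peptides):
    """Return one Counter of residues per position across equal-length peptides."""
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    return [Counter(p[i] for p in peptides) for i in range(length)]


def consensus(peptides):
    """Most frequent residue at each position (ties resolved by first seen)."""
    return "".join(counts.most_common(1)[0][0]
                   for counts in position_frequencies(peptides))


# Hypothetical hexamers recovered from selected phage clones (one-letter code).
selected = ["LVPRGS", "LVPRGT", "AVPRGS", "LIPKGS"]
print(consensus(selected))
```

Positions where no residue clearly dominates are better read as tolerant of substitution than as part of the consensus.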

To screen for in vivo PRTS substrates, this method can be expanded to screen a cDNA
expression library displayed on the surface of phage particles (T7SELECT 10-3 Phage display vector,
Novagen, Madison WI) or yeast cells (pYD1 yeast display vector kit, Invitrogen, Carlsbad CA). In this
case, entire cDNAs are fused between Gene III and the appropriate epitope.

XIX. Identification of PRTS Inhibitors

Compounds to be tested are arrayed in the wells of a multi-well plate in varying concentrations
along with an appropriate buffer and substrate, as described in the assays in Example XVII. PRTS
activity is measured for each well and the ability of each compound to inhibit PRTS activity can be
determined, as well as the dose-response kinetics. This assay could also be used to identify molecules
which enhance PRTS activity.
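
Dose-response data of this kind are often summarized as the concentration giving 50% inhibition (IC50). The Python sketch below shows one simple way to estimate it, by log-linear interpolation between the two tested concentrations that bracket 50% inhibition. The function names, concentrations, and activities are hypothetical, and a full curve fit would normally be preferred where enough data points are available.

```python
import math


def percent_inhibition(activity, uninhibited_activity):
    """Inhibition relative to a control well without test compound."""
    return 100.0 * (1.0 - activity / uninhibited_activity)


def estimate_ic50(concentrations, activities, uninhibited_activity):
    """Interpolate the 50% inhibition point on a log10 concentration scale.

    Returns None when the tested range does not bracket 50% inhibition.
    """
    points = sorted(zip(concentrations, activities))
    curve = [(c, percent_inhibition(a, uninhibited_activity)) for c, a in points]
    for (c_lo, i_lo), (c_hi, i_hi) in zip(curve, curve[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi != i_lo:
            fraction = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + fraction * (
                math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_ic50
    return None


# Hypothetical plate: compound concentrations (micromolar) and measured
# PRTS activity, with an uninhibited control activity of 100 units.
ic50 = estimate_ic50([0.1, 1.0, 10.0, 100.0], [90.0, 75.0, 25.0, 10.0], 100.0)
print(f"estimated IC50: {ic50:.2f} micromolar")
```

A compound that enhances PRTS activity gives negative percent inhibition throughout, so the function returns None for it, and such wells can be flagged separately.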

In the alternative, phage display libraries can be used to screen for peptide PRTS inhibitors.
Candidates are found among peptides which bind tightly to a protease. In this case, multi-well plate
wells are coated with PRTS and incubated with a random peptide phage display library or a cyclic
peptide library (Koivunen, E. et al. (1999) Nat. Biotechnol. 17:768-774). Unbound phage are washed
away and selected phage amplified and rescreened for several more rounds. Candidates are tested for
PRTS inhibitory activity using an assay described in Example XVII.

Various modifications and variations of the described methods and systems of the invention will
be apparent to those skilled in the art without departing from the scope and spirit of the invention.
Although the invention has been described in connection with certain embodiments, it should be
understood that the invention as claimed should not be unduly limited to such specific embodiments.
Indeed, various modifications of the described modes for carrying out the invention which are obvious
to those skilled in molecular biology or related fields are intended to be within the scope of the following
claims.

1. An isolated polypeptide selected from the group consisting of:

a) a polypeptide comprising an amino acid sequence selected from the group consisting of
SEQ ID NO:1-21,

b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical
to an amino acid sequence selected from the group consisting of SEQ ID NO:1-21,

c) a biologically active fragment of a polypeptide having an amino acid sequence selected
from the group consisting of SEQ ID NO:1-21, and

d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from
the group consisting of SEQ ID NO:1-21.

2. An isolated polypeptide of claim 1 selected from the group consisting of SEQ ID NO:1-21.

3. An isolated polynucleotide encoding a polypeptide of claim 1. 

4. An isolated polynucleotide encoding a polypeptide of claim 2. 

5. An isolated polynucleotide of claim 4 selected from the group consisting of SEQ ID
NO:22-42.

6. A recombinant polynucleotide comprising a promoter sequence operably linked to a 
polynucleotide of claim 3. 

7. A cell transformed with a recombinant polynucleotide of claim 6.

8. A transgenic organism comprising a recombinant polynucleotide of claim 6.

9. A method for producing a polypeptide of claim 1, the method comprising:

a) culturing a cell under conditions suitable for expression of the polypeptide, wherein said
cell is transformed with a recombinant polynucleotide, and said recombinant polynucleotide
comprises a promoter sequence operably linked to a polynucleotide encoding the polypeptide of claim
1, and

b) recovering the polypeptide so expressed.

10. An isolated antibody which specifically binds to a polypeptide of claim 1.



11. An isolated polynucleotide selected from the group consisting of:

a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting
of SEQ ID NO:22-42,

b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90%
identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:22-42,

c) a polynucleotide complementary to a polynucleotide of a),

d) a polynucleotide complementary to a polynucleotide of b), and

e) an RNA equivalent of a)-d).

12. An isolated polynucleotide comprising at least 60 contiguous nucleotides of a
polynucleotide of claim 11.

13. A method for detecting a target polynucleotide in a sample, said target polynucleotide
having a sequence of a polynucleotide of claim 11, the method comprising:

a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides
comprising a sequence complementary to said target polynucleotide in the sample, and which probe
specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization
complex is formed between said probe and said target polynucleotide or fragments thereof, and

b) detecting the presence or absence of said hybridization complex, and, optionally, if
present, the amount thereof.

14. A method of claim 13, wherein the probe comprises at least 60 contiguous nucleotides.

15. A method for detecting a target polynucleotide in a sample, said target polynucleotide
having a sequence of a polynucleotide of claim 11, the method comprising:

a) amplifying said target polynucleotide or fragment thereof using polymerase chain reaction
amplification, and

b) detecting the presence or absence of said amplified target polynucleotide or fragment
thereof, and, optionally, if present, the amount thereof.

16. A composition comprising a polypeptide of claim 1 and a pharmaceutically acceptable
excipient.

17. A composition of claim 16, wherein the polypeptide has an amino acid sequence selected
from the group consisting of SEQ ID NO:1-21.

18. A method for treating a disease or condition associated with decreased expression of
functional PRTS, comprising administering to a patient in need of such treatment the composition of
claim 16.



19. A method for screening a compound for effectiveness as an agonist of a polypeptide of
claim 1, the method comprising:

a) exposing a sample comprising a polypeptide of claim 1 to a compound, and

b) detecting agonist activity in the sample.

20. A composition comprising an agonist compound identified by a method of claim 19 and
a pharmaceutically acceptable excipient.

21. A method for treating a disease or condition associated with decreased expression of
functional PRTS, comprising administering to a patient in need of such treatment a composition of
claim 20.

22. A method for screening a compound for effectiveness as an antagonist of a polypeptide
of claim 1, the method comprising:

a) exposing a sample comprising a polypeptide of claim 1 to a compound, and

b) detecting antagonist activity in the sample.

23. A composition comprising an antagonist compound identified by a method of claim 22
and a pharmaceutically acceptable excipient.

24. A method for treating a disease or condition associated with overexpression of functional
PRTS, comprising administering to a patient in need of such treatment a composition of claim 23.

25. A method of screening for a compound that specifically binds to the polypeptide of claim
1, said method comprising the steps of:

a) combining the polypeptide of claim 1 with at least one test compound under suitable
conditions, and



WO 01/98468 PCT/US01/19178



b) detecting binding of the polypeptide of claim 1 to the test compound, thereby identifying a
compound that specifically binds to the polypeptide of claim 1.

26. A method of screening for a compound that modulates the activity of the polypeptide of
claim 1, said method comprising:

a) combining the polypeptide of claim 1 with at least one test compound under conditions
permissive for the activity of the polypeptide of claim 1,

b) assessing the activity of the polypeptide of claim 1 in the presence of the test compound, and

c) comparing the activity of the polypeptide of claim 1 in the presence of the test compound
with the activity of the polypeptide of claim 1 in the absence of the test compound, wherein a change in
the activity of the polypeptide of claim 1 in the presence of the test compound is indicative of a
compound that modulates the activity of the polypeptide of claim 1.

27. A method for screening a compound for effectiveness in altering expression of a target
polynucleotide, wherein said target polynucleotide comprises a sequence of claim 5, the method
comprising:

a) exposing a sample comprising the target polynucleotide to a compound, under conditions
suitable for the expression of the target polynucleotide,

b) detecting altered expression of the target polynucleotide, and

c) comparing the expression of the target polynucleotide in the presence of varying amounts of
the compound and in the absence of the compound.

28. A method for assessing toxicity of a test compound, said method comprising:

a) treating a biological sample containing nucleic acids with the test compound;

b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at
least 20 contiguous nucleotides of a polynucleotide of claim 11 under conditions whereby a specific
hybridization complex is formed between said probe and a target polynucleotide in the biological
sample, said target polynucleotide comprising a polynucleotide sequence of a polynucleotide of claim
11 or fragment thereof;

c) quantifying the amount of hybridization complex; and

d) comparing the amount of hybridization complex in the treated biological sample with the
amount of hybridization complex in an untreated biological sample, wherein a difference in the
amount of hybridization complex in the treated biological sample is indicative of toxicity of the test
compound.

29. A diagnostic test for a condition or disease associated with the expression of PRTS in a
biological sample comprising the steps of:

a) combining the biological sample with an antibody of claim 10, under conditions suitable
for the antibody to bind the polypeptide and form an antibody:polypeptide complex; and

b) detecting the complex, wherein the presence of the complex correlates with the presence
of the polypeptide in the biological sample.

30. The antibody of claim 10, wherein the antibody is:

a) a chimeric antibody,

b) a single chain antibody,

c) a Fab fragment,

d) a F(ab')2 fragment, or

e) a humanized antibody.

31. A composition comprising an antibody of claim 10 and an acceptable excipient.

32. A method of diagnosing a condition or disease associated with the expression of PRTS in
a subject, comprising administering to said subject an effective amount of the composition of claim
31.

33. A composition of claim 31, wherein the antibody is labeled.

34. A method of diagnosing a condition or disease associated with the expression of PRTS in
a subject, comprising administering to said subject an effective amount of the composition of claim
33.

35. A method of preparing a polyclonal antibody with the specificity of the antibody of claim
10 comprising:

a) immunizing an animal with a polypeptide having an amino acid sequence selected from
the group consisting of SEQ ID NO:1-21, or an immunogenic fragment thereof, under conditions to
elicit an antibody response;

b) isolating antibodies from said animal; and

c) screening the isolated antibodies with the polypeptide, thereby identifying a polyclonal
antibody which binds specifically to a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:1-21.

36. An antibody produced by a method of claim 35. 


37. A composition comprising the antibody of claim 36 and a suitable carrier. 

38. A method of making a monoclonal antibody with the specificity of the antibody of claim
10 comprising:

a) immunizing an animal with a polypeptide having an amino acid sequence selected from
the group consisting of SEQ ID NO:1-21, or an immunogenic fragment thereof, under conditions to
elicit an antibody response;

b) isolating antibody producing cells from the animal;

c) fusing the antibody producing cells with immortalized cells to form monoclonal antibody-
producing hybridoma cells;

d) culturing the hybridoma cells; and

e) isolating from the culture monoclonal antibody which binds specifically to a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-21.

39. A monoclonal antibody produced by a method of claim 38.

40. A composition comprising the antibody of claim 39 and a suitable carrier.

41. The antibody of claim 10, wherein the antibody is produced by screening a Fab
expression library.

42. The antibody of claim 10, wherein the antibody is produced by screening a recombinant
immunoglobulin library.

43. A method for detecting a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:1-21 in a sample, comprising the steps of:

a) incubating the antibody of claim 10 with a sample under conditions to allow specific
binding of the antibody and the polypeptide; and

b) detecting specific binding, wherein specific binding indicates the presence of a
polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-21 in
the sample.

44. A method of purifying a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:1-21 from a sample, the method comprising:

a) incubating the antibody of claim 10 with a sample under conditions to allow specific
binding of the antibody and the polypeptide; and

b) separating the antibody from the sample and obtaining the purified polypeptide having an
amino acid sequence selected from the group consisting of SEQ ID NO:1-21.



45. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:1.

46. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:2.

47. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:3.

48. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:4.

49. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:5.

50. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:6.

51. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:7.

52. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:8.

53. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:9.

54. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:10.

55. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:11.




56. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:12.

57. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:13.

58. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:14.

59. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:15.

60. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:16.

61. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:17.

62. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:18.

63. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:19.

64. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:20.

65. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:21.

66. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID
NO:22.

67. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID
NO:23.

68. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID
NO:24.

69. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID
NO:25.

70. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID
NO:26.

71. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID
NO:27.

72. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID
NO:28.

73. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID
NO:29.

74. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID
NO:30.

75. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID
NO:31.

76. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID
NO:32.

77. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID
NO:33.

78. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID
NO:34.

79. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID
NO:35.

80. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID
NO:36.

81. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID
NO:37.

82. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID
NO:38.

83. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID
NO:39.

84. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID
NO:40.

85. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID
NO:41.

86. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID
NO:42.

<210> 1
<211> 232
<212> PRT
<213> Homo sapiens
<220>
<221> misc_feature
<223> Incyte ID No: 275791CD1
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Cys 
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35 
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Lys 


Thr 


Pro 
40 
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Gin 
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45 
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Trp 
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Gin 


Gin 


Tyr 
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Met 


Glu 


Arg 


Glu 


Arg 
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50 
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60 
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Gin 


Glu 


Leu 


Gin 
65 


Gin 


Ala 


Leu 


Ala 


Gin 
70 


Ser 


Leu 


Gin 


Glu 


Gin 
75 


Glu 


Ala 


Trp 


Glu 


Gin 
80 


Lys 


Glu 


Asp 


Asp 


Asp 
85 


Leu 


Lys 


Arg 


Ala 


Thr 
90 


Glu 


Leu 


Ser 


Leu 


Gin 


Glu 


Phe 


Asn 


Asn 


Ser 


Phe 


Val 
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Ala 


Leu 
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V3XU 
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GlU GlU 
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lie 
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Ser 
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Ser Gly His 
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He 


Lys 


Lys 


Gin 
170 


Ala 


Trp Phe 


Thr 


Ser 


Lys 


lie 


Gin 


GlU 
185 


Ala 


Ala Val 


Gin 


Gly 


Tyr 


He 


Phe 


Phe 
200 


Tyr 


Met His 


Lys 


Leu 


GlU 


Thr 


GlU 


Lys 
215 
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Ser Gin 


Ser 


Lys 


T!hr 


Thr 


Arg 


Gin 


Ala 


Ser

<220>
<223> Incyte ID No: 1389845CD1
<400> 2












Met Pro Lys Tyr 


Leu 


Gly 


Gly Gly 


Cys 


1 


5 










Ala Glu Arg Arg 


Val 


Tyr 


Ser 


Leu Gly 
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Thr His Gin Glu 


Leu 


Arg 


Thr 


Asp 
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- 35 










Thr Gly Trp Cys 
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Trp 


Cys 
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Ser Ser Pro Cys 


Tro 
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Gin 


Thr 
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Ala Thr Leu Thr 


Gin 
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Ser 


Leu 
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Ser Cys Leu Gin 


Val 


Leu 


Leu 


Leu 


Leu 




95 










Thr Gin Gly Arg 


Lys 


Ser 


Ala 


Ala 


Cys 
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Ser Arg He Val 


Gly Gly 


Arg Asp 


Gly 




125 










Trp Gin Ala Ser 


He 


Gin 


His 


Arg 


Gly 
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Ser Leu He Ala 


Pro 


Gin 


Trp 


Val 


Leu 




155 








Pro Arg Arg Ala 


Leu 


Pro 


Ala 


Glu 


Tyr 
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Leu Arg Leu Gly 


Ser 


Thr 


Ser 


Pro 


Arg 




185 








Arg Arg Val Leu 


Leu 


Pro 


Pro 


Asp 


Tyr 
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Gly Asp Leu Ala 


Leu 


Leu 


Gin 


Leu 


Arg 




215 








Ala Arg Val Gin 


Pro 


Val 


Cys 


Leu 


Pro 




230 








Pro Pro Gly Thr 


Pro 


Cys 


Arg 


Val 


Thr 




245 










Pro Gly Val Pro 


Leu 


Pro 


Glu 


Trp 


Arg 




260 








Val Pro Leu Leu 


Asp 


Ser 


Arg 


Thr 


Cys 




275 










Gly Ala Asp Val 


Pro 


Gin 


Ala 


Glu 


Arg 
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uys 


Arg 


Asn 


Ala 


Glu 


Thr 


u u 










tor- 

135 


He 


Ser 


Val 


Val 


Ser 


Hxs 


145 










150 


Tyr 


He 


Ser 


Asp 


Val 


Tyr 


160 










165 


Tyr 


Asn 


Asp 


Leu 


Glu 


Val 


175 










180 


Ser 


Asp 


Arg Asp Arg 


Ser 


190 










195 


Glu 


He 


Phe 


Asp 


Glu 


Leu 


205 








210 


Leu 


Ser 


Thr 


Glu 


Val 


Gly 


220 










225 



Cys 


He 


Pro 


Gly Pro 


Trp 


10 








15 


His 


Gin 


Asp 


Lys Ser 


Arg 


25 








30 


Arg 


Thr 


Thr 


Glu Gly Val 


40 








45 


Trp 


Ala 


Arg 


Thr Leu 


Leu 


55 








60 


Val 


Gin 


Ala 


Leu Gly 


Ser 


70 








75 


Asp 


Arg 


Met 


Arg Gly 


Val 


85 








90 


Val 


Leu 


Gly Ala Ala 


Gly 


100 








105 


Gly Gin 


Pro 


Arg Met 


Ser 


115 








120 


Arg 


Asp 


Gly Glu Trp 


Pro 


130 








135 


Ala 


His 


Val 


Cys Gly Gly 


145 








150 


Thr 


Ala 


Ala 


His Cys 


Phe 


160 






165 


Arg 


Val 


Arg Leu Gly Ala 


175 








180 


Thr 


Leu 


Ser 


Val Pro 


Val 


190 








195 


Ser 


Glu 


Asp 


Gly Ala Arg 


205 








210 


Arg 


Pro 


Val 


Pro Leu 


Ser 


220 








225 


Val 


Pro 


Gly Ala Arg 


Pro 


235 








240 


Gly 


Trp 


Gly 


Ser Leu 


Arg 


250 








255 


Pro 


Leu 


Gln


Gly Val 


Arg 


265 








270 


Asp 


Gly 


Leu 


Tyr His 


Val 


280 








285 


Ile


Val 


Leu 


Pro Gly 


Ser 


295 






300
Leu Cys Ala Gly Tyr Pro Gln Gly His

305 

Cys Thr Gln Pro Pro Gln Pro Pro Glu

320 

His Pro Pro Ser Leu Asn Ser Arg Thr

335 

Ala Gln Asp Pro Gly Leu Gln Pro Arg

350 

Trp Asn Pro Glu Asn 

365 

<210> 3 

<211> 416 

<212> PRT

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 1726609CD1 

<400> 3 

Met Trp Gly Arg Tyr Asp Ile Val Phe

1 5
Ile Val Ala Met Glu Asn Pro Cys Leu

20

Ile Leu Glu Ser Asp Glu Phe Leu Val

35

Val Ala His Ser Trp Phe Gly Asn Ala

50

Glu Glu Met Trp Leu Ser Glu Gly Leu

65

Arg Ile Thr Thr Glu Thr Tyr Gly Ala

80

Thr Ala Phe Arg Leu Asp Ala Leu His

95

Gly Glu Asp Ser Pro Val Ser Lys Leu

110

Gly Val Asn Pro Ser His Leu Met Asn

125

Gly Tyr Cys Phe Val Tyr Tyr Leu Ser

140

Gln Arg Phe Asp Asp Phe Leu Arg Ala

155

Phe Thr Ser Val Val Ala Gln Asp Leu

170

Phe Phe Pro Glu Leu Lys Glu Gln Ser

185

Leu Glu Phe Glu Arg Trp Leu Asn Ala

200

Glu Pro Asp Leu Ser Gln Gly Ser Ser

215

Ala Leu Phe Gln Leu Trp Thr Ala Glu

230

Ala Ser Ala Ser Ala Ile Asp Ile Ser

245

Thr Ala Leu Phe Leu Asp Arg Leu Leu

260

Gln Glu Val Val Met Ser Leu Ser Lys

275

Asp Ser Met Asn Ala Glu Ile Arg Ile

290

Val Arg Asn Asp Tyr Tyr Pro Asp Leu

305

Leu Glu Ser Gln Met Ser Arg Met Tyr

320 

Asp Leu Cys Thr Gly Ala Leu Lys Ser 

335 
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Gin Val 
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Ser 
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Ala Gin 
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Trp 
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Phe 
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85 
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Arg 


Gln


Met 


Lys 


Leu 


Leu 
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105 


Gln


Val 


Lys


Leu 


Glu 


Pro 


115 








120 


Leu 


Phe 


Thr 


Tyr 


Glu 


Lys 


130 










135 


Gln


Leu 


Cys 


Gly Asp 


Pro 


145 










150 


Tyr 


Val 


Glu 


Lys 


Tyr 


Lys 


160 










165 


Leu 


Asp 


Ser 


Phe 


Leu 


Ser 


175 










180 


Val 


Asp 


Cys 


Arg 


Ala 


Gly 


190 










195 


Thr 


Gly 


Pro 


Pro 


Leu 


Ala 


205 










210 


Leu 


Thr 


Arg 


Pro 


Val 


Glu 


220 










225 


Pro 
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Asp 


Gln


Ala 


Ala 


235 










240 
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Arg 
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Phe 
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250 
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Asp 


Gly 


Ser 


Pro 
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Pro 


265 
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Ser 
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280 
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Arg 
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He 
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Phe 
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Glu 
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Phe 


340
Tyr Gln Thr Gln Gly

350

Gln Gln Ile Leu Ser

365

Ser Glu Pro Ser Thr

380

Ser Asp Ala Gln Ala

395

Ala Ile Ser Leu Arg

410 

<210> 4 
<211> 714 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature
<223> Incyte ID No: 
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Ser 
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Glu 
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Val 


Gln


Thr 


Lys 
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Glu 


Glu 


Val 


Thr 


Tyr 
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Val 
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Tyr 


Ile
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Val 


Ser 
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Asp 


Lys 


Arg 
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Ser 
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Asp 


Ile


Phe 


Glu 


Arg 
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Gly 
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He 
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Met 


Gly 
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Arg 
185 
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He 
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200 


Asp 
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215 


Ser 
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Glu 
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230 
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Glu 
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Thr 
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245 


Pro 
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Tyr 


Phe 
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Arg 


Arg 
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Asn 
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Leu 
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Gly 
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Met 
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Thr 


Ala 


Lys 
320 


Asp 
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Gin 


Lys 
335 



Arg Leu His Pro Asn 

355 

Gln Gly Leu Gly Ser

370 

Glu Leu Gly Lys Ala 
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Leu Leu Leu Gly Asp 
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Asp Val Asn Val Ser 
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Val 
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Val 
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Lys 


Glu 


Val 
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Arg 
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Glu 
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He 


Val 


His 


Leu 
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Glu 
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Arg 


Arg 


Tyr 
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Gly 


Leu 


His 


Leu 








190 


Met 
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Arg 


Met 
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Glu 


Asp 
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Thr 


Glu 


Glu 


Leu 


Ile
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Gly 


Met 


Leu 


Gly 


Ile
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Ala 


Leu 


Ala Asp 


Val 










105 


Leu 


Asp 


Phe 


Pro 


Gln










120 


Ala 
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Thr 


Glu 


Ala 










135 


Met 


Ser 


Met 


Arg 


Gly 










150 


Glu 


Thr 


Cys 


Asp 


Leu 










165 


Leu 


Glu 


Lys 


Ser 


Ile










180 


Pro 


Glu 


Gln


Val 


Gln










195 


Ser 


Glu 


Leu 


Cys


Ile










210 


Thr 


Phe 


Leu 


Val 


Phe 










225 


Asp 


Phe 


Ile


Asp 
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240 


Ile


Thr 


Leu 


Lys 


Tyr 
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Cys 


Ile


Pro 


Glu 


Thr 








270 


Arg 


Cys


Lys 


Glu 


Glu 










285 


Leu 


Arg 


Thr 


Lys 


Val 
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Asp 


Phe 


Val 


Leu 


Glu 










315 


Thr 


Ala 


Phe 


Leu 


Asp 










330 


Glu 


Ala 


Glu 


Arg 


Glu 
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xle 


Leu 


Asn 


Leu 
350 


Glu


Tyr 


Asp 


Gly 


Lys 
365 


Thr


Gln


Thr 


Glu 


Glu 
380 


Lys 


Glu 


Tyr 


Phe 


Pro 
395 


Thr 


Tyr 


Gln


Glu 


Leu 
410 


Ala 


His 


Val 


Trp 


Asn 
425 


Lys 


Ala 


Thr 


Gly 


Glu 
440 


Pro 


Arg 


Glu 


Gly 


Lys 
455 


Pro 


Gly 


Cys 


Leu 


Leu 
470 


Ala 


Leu 


Val 


Val 


Asn 
485 


Leu 


Leu 


Arg 


His 


Asp 
500 


His 


Val 


Met 


His 


Gln
515 


Ser 


Gly 


Thr 


Asn 


Val 
530


Met 


Leu 


Glu 


Asn 


Trp 
545 


Ser 


Lys 


His 


Tyr


Lys 
560 


Glu 


Lys 


Leu 


Val 


Ala 
575 


Leu 


Arg 


Gln


Ile


Val 
590 


Asn 


Thr 


Ser 


Leu 


Asp 
605 


Glu 


Ile


Leu 


Gly Val 










620 


Thr 


Phe 


Gly 


His 


Leu 
635 


Tyr 


Leu 


Trp 


Ser 


Glu 
650 


Phe 


Lys 


Lys 


Glu 


Gly 
665 


Arg 


Asn 


Leu 


Ile


Leu 
680 


Met 


Leu 


His 


Asn 


Phe 
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Met 


Ser 


Arg 


Gly 
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<220> 
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<223> Incyte ID No: 

<400> 5 

Met Phe Ala Pro Ser 
1 5 
Ser Lys Gly Arg Lys 

20 

Ser Gln Tyr Ile Ser

35 



Lys 
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Lys 


Glu


Cys 

•ace 

355 


Ile


Asn 


Ala 


Trp 


Asp 
370 


Leu 


Lys 


Tyr 


Ser 


Ile
385 


Ile


Glu 


Val 


Val 


Thr 
400 


Leu 


Gly 


Leu 


Ser 


Phe 
415 


Lys 


Ser 


Val 


Thr 


Leu 
430 


Val 


Leu 


Gly 


Gln


Phe 








445 


Tyr 


Asn 


His


Ala 


Ala 
460 


Pro 


Asp 


Gly 


Ser 


Arg 
475 


Phe 


Ser 


Gln


Pro 


Val 
490 


Glu 


Val 


Arg 


Thr 


Tyr 
505 


Ile


Cys 


Ala 


Gln


Thr 
520 


Glu 


Thr 


Asp 


Phe 


Val 
535 


Val 


Trp 


Asp 


Val 


Asp 
550 


Asp 


Gly 


Ser 


Pro 


Ile
565 


Ser 


Arg 


Leu 


Val 


Asn 
580 


Leu 


Ser 


Lys 


Val 


Asp 
595 


Ala 


Ala 


Ser 


Glu 


Tyr 
610 


Ala 


Ala 


Thr 


Pro 


Gly 
625 


Ala 


Gly 


Gly 


Tyr 


Asp 
640 


Val 


Phe 


Ser 


Met 


Asp 
655 


Ile


Met 


Asn 


Pro 


Glu 
670 


Lys 


Pro 


Gly 


Gly 


Ser 
685 


Leu 


Lys 


Arg 


Glu 


Pro 
700 


Leu 


His 


Ala 


Pro 
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Val Leu Ser Ser Gly 

10 

Met Glu Leu Ile Gln

25 

Leu Cys His Glu Leu 

40 



Lys 


Asp 


Arg 


Gly 


Phe 










360


Leu 


Tyr 


Tyr 


Tyr 


Met 










375 


Asp 


Gln


Glu 


Phe 


Leu 










390 


Glu 


Gly 


Leu 


Leu 


Asn 










405 


Glu 


Gln


Met 


Thr 


Asp 










420 


Tyr 


Thr 


Val 


Lys 


Asp 










435 


Tyr 


Leu 


Asp 


Leu 


Tyr 










450 


Cys 


Phe 


Gly 


Leu 


Gln










465 


Met 


Met 


Ala 


Val 


Ala 










480 


Ala 


Gly 


Arg 


Pro 


Ser 










495 


Phe 


His 


Glu 


Phe 


Gly 










510 


Asp 


Phe 


Ala 


Arg 


Phe 










525 


Glu 


Val 


Pro 


Ser 


Gln










540 


Ser 


Leu 


Arg 


Arg 


Leu 










555 


Ala 


Asp 


Asp 


Leu 


Leu 










570 


Thr 


Gly 


Leu 


Leu 


Thr 










585 


Gln


Ser 


Leu 


His 


Thr 










600 


Ala 


Lys 


Tyr 


Cys 


Ser 










615 


Thr 


Asn 


Met 


Pro 


Ala 










630


Gly 


Gln


Tyr 


Tyr 


Gly 










645 


Met 


Phe 


Tyr 


Ser 


Cys 










660 


Val 


Gly


Met 


Lys


Tyr










675 


Leu 


Asp 


Gly 


Met 


Asp 










690 


Asn 


Gln


Lys 


Ala 


Phe 








705 


Leu 


Ser 


Gly 


Gly 


Ala 










15 


Pro 


Lys 


Glu 


Pro 


Thr 








30 


His 


Thr 


Leu 


Phe 


Gln
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va.x 


Leu


Trp 


Ser 


Gly 

3 u 




XliS 


Ser 


Val 


Trp 
65 






Asp Ala 


Gln










80 


Gln


Arg 


Glu 


Leu 


Glu 
95 


Pro


Thr


Ser 


Gln


Arg 

110 






Asn 


Ile


Phe 


His 
125 


Ala 


Cys 


Asp Asn 


Lys 










140 


_ 

Ser


— - — 

Leu 


Glu 


Phe 


Pro 
155 


Ala 


Ser 


Gln


Pro 


Cys 
170 


Glu 


Thr 


Glu 


Ala 


Leu 
185 


Asn 


Ser 


Lys 


Arg 


Arg 
200 


Glu 


Ala 


Gln


Lys 


Gln
215 


Arg 


Leu 


His 


Leu 


Lys 
230 


Glu 


Lys 


Ile


Gly 


Val 
245 


Glu 


Pro 


Tyr 


Cys 


Cys 
260 


Cys 


Phe 


Ile


Tyr 


Asp 
275 


Gly 


Phe 


Gly 


Ser 


Gly 
290 


Gly Gly 


Phe 


Trp 


Val 










305 


Hir 


Met 


Asp 


Glu 


Val 
320 


Thr 


Gln


Arg Val 


Thr 










335 


Glu 


Leu 


Leu 


Leu 


Gly 
350 


Ser 


Ser 


Asn 


Glu 


Ile
365



Lys 


Trp 


Ala 


Leu 


Val 










55 


Arg 


Leu 


Ile


Pro 


Ala 










70


Glu 


Phe


Leu 


Cys 


Glu 










85 


Thr 


Thr 


Gly Thr 


Ser 










100 


Lys 


Leu 


Ile


Lys 


Gln










115 


Gly 


Gln


Leu 


Leu 


Ser 










130


Ser 


Asn 


Thr 


Ile


Glu 










145 


Glu 


Arg 


Tyr 


Gln


Cys 










160 


Leu 


Val 


Thr 


Glu 


Met 










175 


Glu 


Gly 


Lys 


Ile


Tyr 










190 


Arg 


Phe 


Ser 


Ser 


Lys 










205 


Leu 


Met 


Ile


Cys 


His 










220 


Arg 


Phe 


Arg 


Trp 


Ser 










235 


His 


Val 


Gly Phe Glu 










250 


Arg 


Glu 


Thr 


Leu 


Lys 










265 


Leu 


Ser 


Ala 


Val 


Val 










280 


His 


Tyr 


Thr 


Ala 


Tyr










295 


His 


Cys 


Asn Asp 


Ser 










310 


Cys 


Lys 


Ala 


Gln


Ala 










325 


Glu 


Asn 


Gly His 


Ser 










340 


Ser 


Gln


His 


Pro 


Asn 










355 


Leu 


Ser 









«^ 

Ser


Pro 


Phe


Ala 


Met 
60 


Phe 


Arg 


Gly 


Tyr 


Ala 
75 


Leu 


Leu 


Asp 


Lys 


Ile
90 


Leu 


Pro 


Ala 


Leu 


Ile
105 


Val 


Leu 


Asn 


Val 


Val 
120 


Gln


Val 


Thr 


Cys 


Leu 
135 


Pro 


Phe 


Trp 


Asp 


Leu 
150 


Ser 


Gly 


Lys 


Asp 


Ile
165 


Leu 


Ala 


Lys 


Phe 


Thr 
180 


Val 


Cys 


Asp 


Gln


Cys 
195 


Pro 


Val 


Val 


Leu 


Thr 
210 


Leu 


Pro 


Gln


Val 


Leu 
225 


Gly Arg Asn 


Asn 


Arg 










240 


Glu 


Ile


Leu 


Asn 


Met 
255 


Ser 


Leu 


Arg 


Pro 


Glu 
270 


Met 


His 


His 


Gly 


Lys 
285 


Cys 


Tyr 


Asn 


Ser 


Glu 
300 


Lys 


Leu 


Ser 


Met 


Cys 
315 


Tyr 


Ile


Leu 


Phe 


Tyr 
330 


Lys 


Leu 


Leu 


Pro 


Pro 
345 


Glu Asp 


Ala 


Asp 


Thr 










360 
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<400> 6 



Met 


Lys 


Tyr 


Val 


Phe 


Tyr 


Leu 


Gly Val 


Leu 


Ala Gly Thr 


Phe 


Phe 


1 








5 










10 










15 


Phe 


Ala 


Asp 


Ser 


Ser 
20 


Val 


Gln


Lys 


Glu 


Asp 
25 


Pro 


Ala 


Pro 


Tyr 


Leu 
30 


Val 


Tyr 


Leu 


Lys 


Ser 


His 


Phe 


Asn 


Pro 


Cys 


Val 


Gly 


Val 


Leu 


Ile










35 










40 








45 


Lys 


Pro 


Ser 


Trp 


Val 


Leu 


Ala 


Pro 


Ala 


His 


Cys 


Tyr 


Leu 


Pro 


Asn 










50 










55 






60 


Leu 


Lys 


Val 


Met 


Leu 


Gly Asn 


Phe 


Lys 


Ser 


Arg 


Val 


Arg 


Asp 


Gly 










65 










70 










75 


Thr 


Glu 


Gln


Thr 


Ile
80 


Asn 


Pro 


Ile


Gln


Ile
85 


Val 


Arg 


Tyr 


Trp 


Asn 
90 
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Tyr 


Ser 


His 


Ser 


Ala


JtTO 


VaXn 


Asp


Asp Leu


Met 


Leu 


Ile




Leu 


















1 nn 
xuu 








105 


Ala 


Lys 


Pro 


Ala 


1,. 


Leu


Asn 


irJCO 


Lys Val


Gln


Pro 


Leu 


Thr 


Leu 










1 1 n 








lie 
XX3 










120 


Ala 


Thr 


Thr 


Asn 


Val


Arg 


XTiO 


Gly


^^^^ T 1 

Thr Val


Cys


Leu 


Leu 


Ser 


Gly 


















x^u 










135 


Leu 


Asp 


Trp 


Ser 




(aXU 


Asn 


Ser


Gly Arg


His 


Pro 


Asp 


Leu 


Arg 










1 An 








X43 








150 


Gln


Asn 


Leu 


Glu 


Ala


Asn

Pro 


Val


Met 


Ser Asp 


Arg Glu 


Cys 


Gln


Lys 


















160 










165 




Glu 


Gln


Gly 


1^ — — 

Lys 


Ser 


His 


Arg 


Asn Ser 


Leu 


Cys 


Val 


Lys 


Phe 










170 








175 






180 


Val 


Lys 


Val


Phe 


Ser 


Arg 


Ile


Phe 


Gly Glu 


Val 


Ala 


Val 


Ala 


Thr 










185 








190 










195 


Val 


Ile


Cys 


Lys 


Asp 


Lys 


Leu 


Gln


Gly Ile


Glu 


Val 


Gly His 


Phe 










200 








205 










210 


Met 


Gly Gly 


Asp 


Val 


Gly 


Ile


Tyr 


Thr Asn 


Val 


Tyr 


Lys 


Tyr 


Val 










215 








220 








225 


Ser 


Trp 


Ile


Glu 


Asn 


Thr 


Ala 


Lys 


Asp Lys 




















230 








235 
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Met 


Gln


Pro Thr 


Gly 


Arg 


Glu 


1 






5 






Tyr 


Leu 


Arg Arg 


Leu 
20 


Leu 


Leu 


Gln


Pro 


Val Thr 


Arg 
35 


Ala 


Glu 


Leu 


Ser 


Thr Leu 


Gly 
50 


Ser 


Pro 


Pro 


Ser 


Ala Leu 


Thr 
65 


Thr 


Pro 


Lys 


Thr 


Leu Asp 


Leu 
80 


Arg 


Gly 


Phe 


Pro 


Leu Val 


Asp 
95 


Gly 


His 


Gln


Arg 


Tyr Lys 


Asn 
110 


Val 


Leu 


Ser 


His 


Gly Gln


Thr 
125 


Ser 


Leu 


Gly 


Ala 


Gln Phe


Trp 
140 


Ser 


Ala 


Gln


Thr 


Ala Val 


Arg 
155 


Leu 


Ala 


Arg 


Met 


Cys Ala 


Ser 
170 


Tyr 


Ser 


Glu 


Gly 


Leu Asn 


Ser 
185 


Ser 


Gln


Glu 


Gly 


Gly His 


Ser 
200 


Leu 


Asp 


Phe 


Tyr 


Val Leu 


Gly 
215 


Val 


Arg 


Ser 


Thr 


Pro Trp 


Ala 
230 


Glu 


Ser 


Tyr 


Thr 


Asn Val 


Ser 


Gly 


Leu 








245 




Glu 


Glu 


Leu Asn 


Arg 


Leu 


Gly 








260 





Gly 


Ser 


Arg 
10 


Ala 


Leu 


Ser 


Arg Arg 
15 


Leu 


Leu 


Leu 
25 


Leu 


Leu 


Leu 


Leu Arg 
30 


Thr 


Thr 


Pro 
40 


Gly 


Ala 


Pro 


Arg Ala
45 


Ser 


Leu 


Phe 
55 


Thr 


Thr 


Pro 


Gly Val 
60 


Gly 


Leu 


Thr 
70 


Thr 


Pro 


Gly 


Thr Pro 
75 


Arg 


Ala 


Gln
85 


Ala 


Leu 


Met 


Arg Ser 
90 


Asn 


Asp 


Leu 
100 


Pro 


Gln


Val 


Leu Arg 
105 


Gln


Asp 


Val 
115 


Asn 


Leu 


Arg 


Asn Phe 
120 


Asp Arg 


Leu Arg 


Asp 


Gly Leu Val 






130 








135 


Ser 


Val 


Ser 
145 


Cys


Gln


Ser 


Gln Asp
150 


Leu 


Glu 


Gln
160 


Ile


Asp 


Leu 


Ile His
165 


Glu 


Leu 


Glu 
175 


Leu 


Val 


Thr 


Ser Ala 
180 


Lys 


Leu 


Ala 
190 


Cys 


Leu 


Ile


Gly Val 
195 


Ser 


Ser 


Leu 
205 


Ser 


Val 


Leu 


Arg Ser 
210 


Tyr 


Leu 


Thr 
220 


Leu 


Thr 


Phe 


Thr Cys 
225 


Ser 


Thr 


Lys 
235 


Phe 


Arg 


His 


His Met 
240 


Thr 


Ser 


Phe 
250 


Gly 


Glu 


Lys 


Val Val 
255 


Met 


Met 


Ile
265 


Asp 


Leu 


Ser 


Tyr Ala 
270 
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It — ■ 
Asp 


Thr


Leu 


Ile
275 


Val


Ile


Phe


Ser 


His 
290 


Leu 


Asn


Val 


— 

Pro


Asp 
305 


Gly 


Ile


Val 


Met


Val 
320 


Leu 


Leu 


Ala 


Asn 


Val 
335 


Arg 


Ala 


Val 


Ile


Gly 
350 


Asp 


Gly 


Thr


Gly 


Arg 
365


Tyr 


Pro 


Val 


Leu 


Ile
380 


Glu 


Glu 


Leu 


Gln


Gly 
395 


Arg 


Gln


Val 


Glu 


Lys 
410 


Val 


Glu 


Ala 


Glu 


Phe 
425 


Ser 


His 


Leu 


Val 


Pro 
440 


Val 


Thr 


Lys 


Gln


Pro 
455 


Ala 


Ser 


Pro 


Tyr 


Leu 
470 


Pro 


Thr 


Phe 


Thr 


Gln
485 
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<400> 8 



Met 


Leu 


Leu 


Gly 


Arg 


1 








5 


Val 


Pro 


Lys 


Lys 


Ala 
20 


Gly 


Ser 


Ala 


Val 


Gly 
35 


Ser 


Phe


Asp 


Leu 


Gly 
50 


Gly 


Arg 


Ile


Val 


Gly 
65 


Trp 


Gln


Ala 


Ser 


Leu 
80 


Ser 


Leu 


Leu 


Ser 


Pro 
95 


Ser 


Gly 


Ser 


Leu 


Asn 
110 


Leu 


Glu 


Ile


Thr


Leu 
125 


Ile


Leu 


His 


Ser 


Ser 
140 


Ile


Ala 


Leu 


Val 


Glu 
155 


Ile


Leu 


Pro 


Val 


Cys 
170 


Gly 


Ile


Arg 


Cys 


Trp 
185 



Arg 


Arg 


Val Leu 


Glu 








280 


Ser 


Ala 


Ala Arg Ala 








295 


Asp 


Ile


Leu Gln


Leu 








310 


Thr


Leu 


Ser Met 


Gly 








325 


Ser 


Thr


Val Ala 


Asp 








340 


Ser 


Glu 


Phe Ile


Gly 








355 


Phe


Pro 


Gln Gly Leu








370 


Glu 


Glu 


Leu Leu 


Ser 








385 


Val 


Leu 


Arg Gly Asn 








400 


Val 


Arg 


Glu Glu 


Ser 








415 


Pro 


Tyr 


Gly Gln


Leu 








430 


Gln


Asn 


Gly His 


Gln








445 


Thr 


Asn 


Arg Val 


Pro 








460 


Val 


Pro 


Gly Leu 


Val 








475 


Trp 


Leu 


Cys
Val Trp Gln Thr Arg

10 

Gly Arg Cys Gly Gln

25 

Phe Leu Gly Ser Pro 

40 

Cys Gly Arg Pro Gln

55 

Gly His Ala Ala Pro 

70 

Arg Leu Arg Arg Val 

85 

Gln Trp Val Leu Thr

100 

Ser Ser Asp Tyr Gln

115 

Ser Pro His Phe Ser 

130

Pro Ser Gly Gln Pro

145 

Leu Ser Val Pro Val 

160 

Leu Pro Glu Ala Ser 

175 

Val Thr Gly Trp Gly 

190 



Val Ser Gln Ala Pro

285 

Val Cys Asp Asn Leu 

300 

Leu Lys Lys Asn Gly 

315 

Val Leu Gln Cys Asn

330 

His Phe Asp His Ile

345 

Ile Gly Gly Asn Tyr

360 

Glu Asp Val Ser Thr 

375 

Arg Ser Trp Ser Glu 

390 

Leu Leu Arg Val Phe 

405 

Arg Ala Gln Ser Pro

420 

Ser Thr Ser Cys His 

435 

Ala Thr His Leu Glu 

450 

Trp Arg Ser Ser Asn 

465 

Ala Ala Ala Thr Ile

480 



Glu 


Leu 


Lys 


Ser 


Lys 
15 


Gly Arg 


Leu 


His 


Gly 










30 


Pro 


Gly Thr 


Pro 


Ser 










45 


Val 


Ser 


Asp 


Ala 


Gly 
60 


Ala 


Gly Ala 


Trp 


Pro 










75 


His 


Val 


Cys


Gly 


Gly 
90 


Ala 


Ala 


His 


Cys 


Phe 
105 


Val 


His 


Leu Gly 


Glu 










120 


Thr 


Val 


Arg Gln


Ile










135 


Gly 


Thr 


Ser Gly 


Asp 










150 


Thr 


Leu 


Phe 


Ser 


Arg 
165 


Asp 


Asp 


Phe 


Cys 


Pro 
180 


Tyr 


Thr 


Arg 


Glu 


Gly 
195 
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01 tl 


Pro Leu


^^^^ 


^^^^^^ 

Pro
200 


Pro


Tyr 


Ser Leu 




Val Asp


Thr


vsxU 

215 


Thr


KjyS 


Arg Arg 




oer ±±e 




230 


Pro 


Asp 


Met Leu 




A J. a vjys 




Asp
0 >1 c 


Asp 


Ser 


Gly Gly 


Asn


vaJLy Ai.a. 


Trp 


Val 


Gln


Ala 


Gly Ile








0 c rt 






Cys 


Gly Arg 


Pro 


Asn 

275 


Arg 


Pro 


Gly Val 


Tyr 


Val Asn 


Trp 


Ile
290 


Arg 


Arg 


His Ile


Glu 


Ser Gly 


Tyr 


Pro 
305


Arg 


Leu 


Pro Leu 


Pro 


Gly Leu 


Phe 


Leu 
320 


Leu 


Leu 


Val Ser 


Cys 


Leu Leu 


His 


Pro 


Ser 


Ala 


Asp Gly 








335 






Asp 















PCT/US01/19178



Arg 


Glu 


Val 


Lys Val 


Ser 


one 






210 


Asp 


Tyr 


Pro 


Gly Pro 


Gly 


0 0 rt 








225 


Cys 


Ala Arg 


Gly Pro 


Gly 


0 FT 

235 








240 


Pro 


Leu 


Val 


Cys Gln


Val 


250 








255 


Val 


Ser 


Trp 


Gly Glu Gly 


265 








270 


Tyr 


Thr 


Arg 


Val Pro 


Ala 


280 








285 


Thr 


Ala 


Ser 


Gly Gly 


Ser 


295 








300 


Leu 


Ala 


Gly 


Leu Phe 


Leu 


310 








315 


Cys 


Val 


Leu 


Leu Ala 


Lys 


325 








330 


Thr 


Pro 


Phe 


Pro Ala 


Pro 


340 








345 
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<400> 9 



Met 


Ala 


Ala 


Ala 


Met 


Glu 


Thr 


Glu 


Gln


Leu 


Gly Val 


Glu 


Ile


Phe 


1 








5 










10 










15 


Glu 


Thr 


Ala 


Asp 


Cys 


Glu 


Glu 


Asn 


Ile


Glu 


Ser 


Gln


Asp 


Arg 


Pro 










20 










25 






30 


Lys 


Leu 


Glu 


Pro 


Phe 


Tyr


Val 


Glu Arg 


Tyr 


Ser 


Trp 


Ser 


Gln


Leu 










35 










40 








45 


Lys 


Lys 


Leu 


Leu 


Ala 


Asp 


Thr 


Arg 


Lys 


Tyr 


His 


Gly 


Tyr 


Met 


Met 










50 










55 








60 


Ala 


Lys 


Ala 


Pro 


His 


Asp 


Phe 


Met 


Phe 


Val 


Lys 


Arg 


Asn 


Asp 


Pro 










65 










70 








75 


Asp 


Gly Pro 


His 


Ser 


Asp 


Arg 


Ile


Tyr 


Tyr 


Leu 


Ala 


Met 


Ser 


Gly 










80 










85 










90 


Glu 


Asn 


Arg 


Glu 


Asn 


Thr 


Leu 


Phe 


Tyr 


Ser 


Glu 


Ile


Pro 


Lys 


Thr 










95 










100 








105 


Ile


Asn Arg 


Ala 


Ala 


Val 


Leu 


Met 


Leu 


Ser 


Trp 


Lys 


Pro 


Leu 


Leu 










110 










115 








120 


Asp 


Leu 


Phe 


Gln


Ala 


Thr 


Leu 


Asp 


Tyr 


Gly Met 


Tyr 


Ser 


Arg 


Glu 










125 










130 






135 


Glu 


Glu 


Leu 


Leu 


Arg 


Glu 


Arg 


Lys 


Arg 


Ile


Gly Thr Val 


Gly 


Ile










140 










145 








150 


Ala 


Ser 


Tyr 


Asp 


Tyr 
155 


His 


Gln


Gly 


Ser 


Gly 
160 


Thr 


Phe 


Leu 


Phe 


Gln
165 


Ala 


Gly 


Ser 


Gly 


Ile


Tyr 


His 


Val 


Lys 


Asp 


Gly Gly Pro 


Gln


Gly 










170 










175 










180 


Phe 


Thr 


Gln


Gln


Pro 
185 


Leu 


Arg 


Pro 


Asn 


Leu 
190 


Val 


Glu 


Thr 


Ser 


Cys 
195 


Pro 


Asn 


Ile


Arg 


Met 


Asp 


Pro 


Lys 


Leu 


Cys


Pro 


Ala Asp 


Pro 


Asp 










200 










205 










210 


Trp 


Ile


Ala 


Phe 


Ile
215 


His 


Ser 


Asn 


Asp 


Ile
220 


Trp 


Ile


Ser 


Asn 


Ile
225 


Val 


Thr 


Arg 


Glu 


Glu 


Arg Arg 


Leu 


Thr 


Tyr 


Val 


His 


Asn 


Glu 


Leu 










230 










235 










240 


Ala 


Asn 


Met 


Glu 


Glu 


Asp 


Ala 


Arg 


Ser 


Ala 


Gly Val 


Ala 


Thr 


Phe 










245 










250 










255 
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Val 


Leu 


Gln


Glu 


Glu 


Phe Asp Arg 


Tyr 


Ser 


«xy xyr 


Trp 


Trp 


Cys










260 










265 






Pro 


Lys


Ala 


Glu 


Thr 


Thr 


Pro 


Ser 


Gly Gly 


Lys Ile


Leu 


Arg 


Ile










275 










O O 








Leu


Tyr


Glu 


Glu 


Asn 


Asp 


Glu 


Ser 


Glu 


Val


Glu Ile


Ile


His 


Val










290 










o o t 

295 








J uu 




Ser 


-Xi w 




Leu 


Glu 


Thr Arg 


Arg 


Ala 


Asp Ser 


Phe Arg 


Tyr 










305 










310 








lie 

^ lb 


Pro 


jr 9 




v» JLjr 


Thr 


Ala 


Asn 


Pro 


Lys


Val 


Thr Phe 


Lys 


Met 


Ser 










320 










O FT 

325 






^ n 


Glu 


Ile




T1 o 


Asp Ala Glu Gly Arg 


Ile


Ile Asp


Val 


Ile


Asp 










335 










340 










JJtV s 






Tl a 


Gln


Pro 


Phe 


Glu 


Ile


Leu 


Phe Glu Gly Val 


Glu 










350










^ ^ ^ 

355 








360


'PVTT* 


Tic* 


Ala


Arg 


Ala 


Gly Trp 


Thr 


Pro 


Glu 


Gly Lys 


Tyr 


Ala 


Trp 










365 










370 








*a T C 

37b 




Tl o 




ueu. 


Asp 


Arg 


Ser 


Gln


Thr 


Arg 


Leu Gln


Ile


Val 


Leu 










380 










385 








"y n n 

390 


T1 o 

Ile




rTO 




Leu 


Phe 


Ile


Pro 


Val 


Glu 


Asp Asp 


Val 


Met 


Glu 










395 










400 








405 


Arg 


Gln


^ ^^^^ 

Arg 


Leu 


Ile


Glu 


Ser 


Val 


Pro 


Asp 


Ser Val 


Thr 


Pro 


Leu 










410 










415 








420 


Ile


Ile


Tyr 


111 

Glu


Glu 


Thr 


Thr 


Asp 


Ile


Trp 


Ile Asn


Ile


His 


Asp 










425 










430








435 


Ile


Phe 




Val 


Phe 


Pro 


Gln


Ser 


His 


Glu 


Glu Glu 


Ile


Glu 


Phe 










440 










445 








450 


Ile


Phe 


Ala


Ser 


Glu 


Cys 


Lys 


Thr 


Gly 


Phe 


Arg His 
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Lys 


Gly 


Ser 


Leu 










755 






Ala 


Val 


Glu 


Ser 


Leu 


Gln


Ala 










770 






Thr 


Val 


Glu 


Val 


Leu 


Ser 


Val 










785 






Arg 


Tyr 


Ser 


Phe 


Tyr 


Leu 


Pro 










800 






Ser 


His 


Pro 


Pro 


His 


Pro 


Arg 










815 




Ser 


Val 


Leu 


Ser 


Leu 


Ser 


Asn 










830






Pro 


Pro 


Ala Arg 


Trp 


Val 


Ala 










845 






Ser 


Cys 


Gly 


Ser 


Gly 


Leu 


Gln










860 






Ser 


Ala 


Gly Gln


Arg 


Thr 


Val 










875 






Pro 


Val 


Glu 


Thr 


Gln


Ala 


Cys 










890 




Leu 


Ser 


Ala 


Trp 


Ser 


Pro 


Cys 










905 






Gln


Arg Arg 


Ser 


Leu 


Lys 


Cys 










920 






Ala 


Arg Asp 


Gln


Cys 


Asn 


Leu 










935 






Phe 


Cys 


Val 


Leu 


Arg 


Pro 


Cys 










950 
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520 








525 


Gly Gly Gly Val Gln Leu Ala


Arg 






535 








c A n 

540 


^^^^ 

Pro


Ala 


Asn 


Gly Gly Lys 


Tyr 


Cys 






550 








555 


Arg 


Ser 


Cys 


Asn Leu 


Glu 


Pro 


Cys 






565 








con 

570 


Ser 


Phe 


Arg 


Glu Glu 


Gln


Cys 


Glu 






580 








o n* 

585 


Ser 


Thr 


Asn 


Arg Leu 


Thr 


Leu 


Ala 






595 








600 


Ser 


Gly Val 


Ser Pro 


Arg 


Asp 


Lys 






610 








615 


Asn 


Gly 


Thr 


Gly Tyr 


Phe 


Tyr 


Val 






625 








630 


Asp 


Gly 


Thr 


Leu Cys 


Ser 


Pro 


Asp 






640 








645 


Gly Lys 


Cys 


Ile Lys


Ala 


Gly 


Cys 






655 








660 


Lys 


Arg 


Phe 


Asp Lys 


Cys


Gly 


Val 






670 








675 


Cys 


Lys 


Lys 


Val Thr 


Gly 


Leu 


Phe 






685 








690 


Asn 


Phe 


Val 


Val Ala 


Ile


Pro 


Ala 






700 








705 


Arg 


Gln


Arg 


Gly Tyr 


Lys 


Gly 


Leu 






715 








720 


Ala 


Leu 


Lys 


Asn Ser 


Gln


Gly 


Lys 






730 








735 


Val 


Val 


Ser 


Ala Val 


Glu 


Arg 


Asp 






745 








750 


Leu 


Arg 


Tyr


Ser Gly Thr 


Gly 


Thr 






760 








765 


Ser 


Arg 


Pro 


Ile Leu


Glu 


Pro 


Leu 






775 








780 


Gly Lys 


Met 


Thr Pro 


Pro 


Arg 


Val 






790 








795 


Lys 


Glu 


Pro 


Arg Glu 


Asp 


Lys 


Ser 






805 








810 


Gly Gly 


Pro 


Ser Val


Leu 


His 


Asn 






820 








825 


Gln


Val 


Glu 


Gln Pro


Asp 


Asp 


Arg 






835 








840 


Gly 


Ser 


Trp Gly Pro 


Cys


Ser 


Ala 






850 








855 


Lys 


Arg 


Ala 


Val Asp


Trp 


Arg 


Gly 






865 








870 


Pro 


Ala 


Cys 


Asp Ala Ala 


His 


Arg 






880 








885 


Gly Glu 


Pro 


Cys Pro 


Thr 


Trp 


Glu 






895 








900 


Ser 


Lys 


Ser 


Cys Gly Arg 


Gly 


Phe 






910 






915 


Val 


Gly 


His 


Gly Gly Arg 


Leu 


Leu 






925 








930 


His 


Arg 


Lys 


Pro Gln


Glu 


Leu 


Asp 






940 








945 






WO 01/98468






<223> Incyte ID No:


<400> 12 






Met 


Glu 


Asn 


Trp 


Thr


1 






5


Leu 


Ser 


Leu 


Pro 


Gln












His 


Ser 


Leu 


Gln


Thr










JO 


Val 


Trp 


Gly 


Pro 


Trp 










50 


Gly Val 


Gly Val 


Gln












Val 


Gln


Leu 


His 


Pro 










80 


His 


Pro 


Glu 


Ala 


Leu 










95 


Thr 


Ser 


Pro 


Glu 


Thr 










110 


Arg 


Gly 


Gly 


Pro 


Leu 










125 


Glu 


Thr 


Gln


Glu 


Ile










140 


Pro 


Ile


Lys 


Pro 


Gly 










155 


Leu 


Pro 


Leu 


His 


Arg 










170 


Ser 


Glu 


Leu 


Ser 


Leu 










185 


Ser 


Pro 


Thr 


Pro 


Arg 










200 


Gln


Thr 


Glu 


Leu 


Pro 










215 


Pro 


Gln


Ala 


Glu 


Pro 










230 


Ala 


Pro 


Arg 


Thr 


Arg 










245 


Gln


Ala 


Ser 


Gly 


Thr 










260 


Glu 


Gly 


Gly 


Phe 


Phe 










275 


Ser 


Gln


Gly Trp 


Ala 










290 


Pro 


Phe 


Pro 


Ser 


Val 










305


Pro 


Trp 


Gly Thr 


Gly 










320 


Asp 


Pro 


Gln


His 


Pro 










335


Pro 


His 


Ala 


Ser 


Ser 










350


Ile


Pro 


Arg 


Cys 


Ser












Gln


Ala 


Pro 


Cys 


Pro










380


Cys 


Ala 


Ala 


Phe 


Asn 










395 


Trp 


Glu 


Pro 


Phe 


Thr 










410 


Asn 


Cys 


Arg 


Pro 


Arg 










425 


Lys 


Val 


Gln Asp


Gly 










440 


Cys 


Val 


Ala 


Gly 


Arg 










455 


Gly 


Ser 


Gly Arg 


Arg 










470 



7604035CD1 



Gly 


Arg 


Pro 


Trp 


Leu 

10 


Leu 


Cys 


Leu 


Asp 


Gln
25 


Pro 


Thr 


Glu 


Glu 


Gly 
40 


Val 


Gln


Trp 


Ala 


Ser 
55 


Arg 


Arg 


Ser 


Arg 


Thr 
70 


Ser 


Leu 


Pro 


Leu 


Pro 
85 


Leu 


Pro 


Arg 


Gly 


Gln
100 


Leu 


Pro 


Leu 


Tyr 


Arg 
115 


Arg 


Gly 


Pro 


Ala 


Ser 
130 


Arg 


Ala 


Ala 


Arg 


Arg 
145 


Met 


Phe 


Gly 


Tyr 


Gly 
160 


Asn 


Arg 


Arg 


His 


Pro 
175 


Ile


Ser 


Ser 


Arg 


Gly 
190 


Ala 


Glu 


Pro 


Phe 


Ser 
205 


Pro 


Thr 


Glu 


Leu 


Ser 
220 


Leu 


Ser 


Pro 


Glu 


Thr 
235 


Pro 


Ala 


Pro 


Leu 


Arg 
250 


Glu 


Pro 


Pro 


Ser 


Pro 
265 


Arg 


Ala 


Ser 


Pro 


Gln
280 


Ser 


Pro 


Gln


Val 


Ala 
295 


Pro 


Arg 


Gly 


Arg 


Gly 
310 


Gly 


Thr 


Pro 


His 


Gly 
325 


Gly 


Ala 


Trp 


Leu 


Pro 

340


Leu 


Trp 


Ser 


Leu 


Phe 

355


Gly 


Glu 


Ser 


Glu 


Gln
370 


Pro 


Glu 


Gln


Pro 


Asp 
385 


Ser 


Gln


Glu 


Phe 


Met 
400 


Glu 


Val 


Gln


Gly 


Ser 








415 


Gly 


Phe 


Arg 


Phe 


Tyr 
430 


Thr 


Leu 


Cys 


Gln


Pro 
445 


Cys 


Leu 


Ser 


Pro 


Gly 
460 


Pro 


Asp 


Gly 


Cys 


Gly 
475 



Tyr 


Leu 


Leu 


Leu 


Leu 
15 


Glu 


Val 


Leu 


Ser 


Gly 
30 


Gln


Gly 


Pro 


Glu 


Gly 
45 


Cys 


Ser 


Gln


Pro 


Cys 
60 


Cys 


Gln


Leu 


Pro 


Thr 
75 


Pro 


Arg 


Pro 


Pro 


Arg 
90 


Gly 


Pro 


Arg 


Pro 


Gln
105 


Thr 


Gln


Ser 


Arg 


Gly 
120 


His 


Leu 


Gly 


Arg 


Glu 
135 


Ser 


Arg 


Leu 


Arg 


Asp 
150 


Arg 


Val 


Pro 


Phe 


Ala 
165 


Arg 


Ser 


Pro 


Pro 


Arg 
180 


Glu 


Glu 


Pro 


Ile


Pro 
195 


Ala 


Asn 


Gly 


Ser 


Pro 
210 


Val 


His 


Thr 


Pro 


Ser 
225 


Ala 


Gln


Thr 


Glu 


Val 
240 


His 


His 


Pro 


Arg 


Ala 
255 


Thr 


His 


Ser 


Leu 


Gly 
270 


Pro 


Arg 


Arg 


Pro 


Ser 
285 


Gly 


Arg 


Arg 


Pro 


Asp 
300 


Gln


Gln


Gly 


Gln


Gly 
315 


Pro 


Arg 


Leu 


Glu 


Pro 
330 


Leu 


Leu 


Ser 


Asn 


Gly 
345 


Ala 


Pro 


Ser 


Ser 


Pro 
360 


Leu 


Arg 


Ala 


Cys 


Ser 
375 


Pro 


Arg 


Ala 


Leu 


Gln
390 


Gly 


Gln


Leu 


Tyr 


Gln
405 


Gln


Arg 


Cys 


Glu 


Leu 
420 


Val 


Arg 


His 


Thr 


Glu 
435 


Gly 


Ala 


Pro 


Asp 


Ile
450 


Cys 


Asp 


Gly 


Ile


Leu 
465 


Val 


Cys 


Gly 


Gly 


Asp 
480 
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Asia 


S&ir 


Ttir 






Leu 


Val Ser


Gly










485 








Gly


Pro 




Gly


Tyr


Gln


Lys Ile












500 














Gln


Ile


Ala 


Gln Leu












515 








Ala 




nx y 


Gly


f ^ w 


Gly Gly Arg 




















Ala 


Val 

VCIJL 






£^X \J 


Gly 


Ser Tyr


<nx y 




















Tyr


A en 


jnx ^ 


JrX O 


Pro 


Arg Glu 


uXU 










^ o u 










Ala 


\7XLL 




f^XO 


Thr 


Thr Gln


jrXO 




















(.9 XXI 


VaXU 


V^XU 


Asn


Pro 


Gly Val 


T>Vt A 

irne 










J J \j 








IrXO 




Jrx O 


xxe 


Leu 


Glu Asn 


Pro


















rTO 


CvXil 


Leu


V9Xll 


Pro


Glu 


Ile Leu


Arg 










/ton 










^^^^^^ 

Pro


Arg 


Pro


Ala Arg Thr 


Pro 


















Val


Arg 


Ile


Pro


vjyin 


Met 


Pro Ala 


Pro










OD U 








Leu


Gly


Ser


Pro


Ala 


Ala 


Tyr Trp 


Lys 










DOD 








Cys


Ser 


Ala


Ser 


Cys 


Gly 


Lys Gly 


Val 










con 








Cys 


Ile


Ser 


Arg 


Glu 


Ser 


Gly Glu 


Glu 










<^ Q C 








Ala


Ala


Gly 


Ala 


Arg 


Pro 


Pro Ala 


Ser 










Tin 








P^^L« -i_n 

Thr


Pro 


Cys


Pro 


Pro 


Tyr 


Trp Glu 


Ala 


















Ser


Arg 


Ser 


Cys 


Gly 


Pro 


Gly Thr 


Gln










740








Arg 




Glu


Phe


Gly 


Gly Gly Gly 


Ser










755








Cys


Gly 


His


Leu


Pro


Arg 


Pro Asn 


Ile










770 








Arg 


Leu


trys 


Gly 


His 


Trp 


Glu Val 


Gly










785 










Val


Arg 


Cys 


Gly 


Arg Gly Gln


Arg 










800 












Asn 


Asn 


Gly 


Asp 


Glu Val 


Ser










815 








Gly 


Pro 


Pro 


Gln


Pro 


Pro 


Ser Arg 


Glu 










830 








Cys 


Uir 


Thr 


Ala 


Trp 


Phe 


His Ser 


Asp 










845 








Ala 


Glu 


Cys 


Gly Thr 


Gly 


Ile Gln


Arg 










860 








Gly 


Ser 


Gly Ala Ala 


Thr 


Arg Ala 


Arg 










875 








Arg 


Asn 


Trp 


Ala 


Glu 


Leu 


Ser Asn 


Arg 










890 
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Asn 


Leu 


Thr 


Asp 


Arg 


Gly 


490 










495 


Trp 


Ile


Pro 


Ala 


Gly 


Ala 


505 










510 


Pro 


Ser 


Ser 


Asn 


Tyr 


Leu 


520 








525 


Ile


Ile


Asn 


Gly 


Asn 


Trp 


535 










540 


Ala 


Gly 


Gly 


Thr 


Val 


Phe 


550 










555 


Gly 


Lys 


Gly 


Glu 


Ser 


Leu 


565 










570 


Val 


Asp 


Val 


Tyr


Met 


Ile


580 










585 


Tyr 


Gln


Tyr 


Val 


Ile


Ser 


595 










600 


Thr 


Pro 


Glu 


Pro 


Pro 


Val 


610 










615 


Val 


Glu 


Pro 


Pro 


Leu 


Ala 


625 










630 


Gly 


Thr 


Leu 


Gln


Arg 


Gln


640 










645 


Pro 


His 


Pro 


Arg 


Thr 


Pro 


655 










660 


Arg 


Val 


Gly 


His 


Ser 


Ala 


670 










675 


Trp 


Arg 


Pro 


Ile


Phe 


Leu 


685 










690 


Leu 


Asp 


Glu 


Arg 


Ser 


Cys 


700 










705 


Pro 


Glu 


Pro 


Cys


His 


Gly 


715 










720 


Gly 


Glu 


Trp 


Thr 


Ser 


Cys 


730 










735 


His 


Arg 


Gln


Leu 


Gln


Cys 


745 










750 


Ser 


Val 


Pro 


Pro 


Glu 


Arg 


760 










765 


Thr 


Gln


Ser 


Cys


Gln


Leu 


775 








780 


Ser 


Pro 


Trp 


Ser 


Gln


Cys 


790 










795 


Ser 


Arg 


Gln


Val 


Arg 


Cys 


805 










810 


Glu 


Gln


Glu 


Cys


Ala 


Ser 


820 








825 


Ala 


Cys 


Asp 


Met 


Gly 


Pro 


835 










840 


Trp 


Ser 


Ser 


Lys 


Cys 


Ser 


850 










855 


Arg 


Ser 


Val 


Val 


Cys 


Leu 


865 










870 


Pro 


Gly 


Gly 


Ser 


Arg 


Ser 


880 










885 


Lys 


Pro 


Ala 


Pro 







895 
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Met 


Phe 


Leu 


Leu 


Ala 


1 








5 


Thr 


Tyr 


Val 


Val 


Val 

20 


Glu 


Arg 


Thr 


Ala 


Arg 

35 


Tyr 


Leu 


Thr 


Lys 


Ile
50 


Phe 


Leu 


Val 


Lys 


Met 
65 


Leu 


Pro 


His 


Val 


Asp 
80 


Gln


Ser 


Ile


Pro 


Trp 
95 


Arg 


Ala 


Asp 


Glu 


Tyr 
110 


Val 


Tyr 


Leu 


Leu 


Asp 
125 


Glu 


Gly 


Arg 


Val 


Met 
140 


Asp 


Gly 


Thr 


Arg 


Phe 
155 


Gly 


Thr 


His 


Leu 


Ala 
170 


Ala 


Lys 


Gly 


Ala 


Ser 
185 


Gly 


Lys 


Gly 


Thr 


Val 
200 


Arg 


Lys 


Ser 


Gln


Leu 
215 


Leu 


Pro 


Leu 


Ala 


Gly 
230 


Gln


Arg 


Leu 


Ala 


Arg 
245 


Asn 


Phe 


Arg 


Asp 


Asp 
260 


Glu 


Val 


Ile


Thr 


Val 
275 


Thr 


Leu 


Gly 


Thr 


Leu 








290 


Phe 


Ala 


Pro 


Gly 


Glu 
305 


Thr 


Cys 


Phe 


Val 


Ser 
320 


Val 


Ala 


Gly 


Ile


Ala 

335 


Thr 


Leu 


Ala 


Glu 


Leu 
350 


Asp 


Val 


Ile


Asn 


Glu 
365 


Thr 


Pro 


Asn 


Leu 


Val 
380 


Gly 


Trp 


Gln


Leu 


Phe 
395 


Pro 


Thr 


Arg 


Met 


Ala 
410 


Glu 


Leu 


Leu 


Ser 


Cys 
425 


Gly 


Glu 


Arg 


Met 


Glu 
440 


His 


Asn 


Ala 


Phe 


Gly 
455 


Cys 


Leu 


Leu 


Pro 


Gin 
470 


Ala 


Glu 


Ala 


Ser 


Met 
485 


His 


Val 


Leu 


Thr 


Gly 



Trp 


Gly 


Gln


Asp 


Pro 
10 


Leu 


Lys 


Glu 


Glu 


Thr 
25 


Arg 


Leu 


Gln


Ala 


Gln
40 


Leu 


m 

His


Val 


Phe 


His 
55 


Ser 


Gly 


Asp 


Leu 


Leu 
70 


Tyr 


Ile


Glu 


Glu 


Asp 
85 


Asn 


Leu 


Glu 


Arg 


Ile
100 


Gln


Pro 


Pro 


Asp 


Gly 
115 


Thr 


Ser 


Ile


Gln


Ser 
130 


Val 


Thr 


Asp 


Phe 


Glu 
145 


His 


Arg 


Gln


Ala 


Ser 
160 


Gly 


Val 


Val 


Ser 


Gly 
175 


Met 


Arg 


Ser 


Leu 


Arg 
190 


Ser 


Gly 


Thr 


Leu 


Ile








205 


Val 


Gln


Pro 


Val 


Gly 
220 


Gly 


Tyr 


Ser 


Arg 


Val 
235 


Ala 


Gly 


Val 


Val 


Leu 
250 


Ala 


Cys 


Leu 


Tyr 


Ser 
265 


Gly 


Ala 


Thr 


Asn 


Ala 
280 


Gly 


Thr 


Asn 


Phe 


Gly 
295 


Asp 


Ile


Ile


Gly 


Ala 
310 


Gln


Ser 


Gly 


Thr 


Ser 
325 


Ala 


Met 


Met 


Leu 


Ser 
340 


Arg 


Gln


Arg 


Leu 


Ile
355 


Ala 


Trp 


Phe 


Pro 


Glu 








370 


Ala 


Ala 


Leu 


Pro 


Pro 
385 


Cys 


Arg


Thr 


Val 


Trp 
400 


Thr 


Ala 


Ile


Ala 


Arg 

415


Ser 


Ser 


Phe 


Ser 


Arg 
430


Ala 


Gln


Gly 


Gly 


Lys 
445 


Gly 


Glu 


Gly 


Val 


Tyr 
460 


Ala 


Asn 


Cys


Ser 


Val 
475 


Gly 


Thr 


Arg 


Val 


His 








490 


Cys 


Ser 


Ser 


His 


Trp 



Trp 


Arg 


Leu 


Pro 


Gly 
15 


His 


Leu 


Ser 


Gln

J, 


Ser 
30 


Ala 


Ala 


Arg 


Arg 


Gly 
45 


Gly 


Leu 


Leu 


Pro 


Gly 
60 


Glu 


Leu 


Ala 


Leu 


Lys 
75 


Ser 


Ser 


Val 


Phe 


Ala 
90 


Thr 


Pro 


Pro 


Arg 


Tyr 
105 


Gly 


Ser 


Leu 


Val 


Glu 








120 


Asp 


His 


Arg 


Glu 


Ile
135 


Asn 


Val 


Pro 


Glu 


Glu 
150 


Lys 


Cys 


Asp 


Ser 


His 
165 


Arg 


Asp 


Ala 


Gly 


Val 
180 


Val 


Leu 


Asn 


Cys 


Gln
195 


Gly 


Leu 


Glu 


Phe 


Ile








210 


Pro 


Leu 


Val 


Val 


Leu 
225 


Leu 


Asn 


Ala 


Ala 


Cys 
240 


Val 


Thr 


Ala 


Ala 


Gly 
255 


Pro 


Ala 


Ser 


Ala 


Pro 
270 


Gln


Asp 


Gln


Pro 


Val 
285 


Arg 


Cys 


Val 


Asp 


Leu 
300 


Ser 


Ser 


Asp 


Cys 


Ser 
315 


Gln


Ala 


Ala 


Ala 


His 
330 


Ala 


Glu 


Pro 


Glu 


Leu 
345 


His 


Phe 


Ser 


Ala 


Lys 
360 


Asp 


Gln


Arg 


Val 


Leu 
375 


Ser 


Thr 


His


Gly 


Ala 
390 


Ser 


Ala 


His 


Ser 


Gly 
405 


Cys 


Ala 


Pro 


Asp 


Glu 
420 


Ser 


Gly 


Lys 


Arg 


Arg 
435 


Leu 


Val


Cys 


Arg 


Ala 
450 


Ala 


Ile


Ala 


Arg 


Cys 
465 


His 


Thr 


Ala 


Pro 


Pro 
480 


Cys 


His 


Gln


Gln


Gly 

495 


Glu 


Val 


Glu 


Asp 


Leu 
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500 










Gly 


Thr 


His


Lys 


Pro 
515 


Pro 


Val 


Leu 


Arg 


Gln


Cys 


Val 


Gly 


His 
530


Arg 


Glu 


Ala 


Ser 


His


Ala 


Pro 


Gly 


Leu 
545 


Glu 


Cys 


Lys 


Val 


Ala 


Pro 


Gln


Glu 


Gln
560 


Val 


Thr 


Val 


Ala 


Leu 


Thr 


Gly 


Cys 


Ser 
575 


Ala 


Leu 


Pro 


Gly 


Ala 


Tyr 


Ala 


Val 


Asp 
590 


Asn 


Thr 


Cys


Val 


Ser 


Thr 


Thr 


Gly 


Ser 
605 


Thr 


Ser 


Glu 


Glu 


Ile


Cys


Cys 


Arg 


Ser 
620 


Arg 


His 


Leu 


Ala 


Gln
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510 


Pro 


Arg Gly Gln


Pro 


Asn 


520 










r~ o C" 

525 


Ile


His 


Ala 


Ser 


Cys 


Cys 


535 










540 


Lys 


Glu 


His 


Gly 


Ile


Pro 


550 










555 


Cys 


Glu 


Glu 


Gly Trp 


Thr 


565 










570 


Thr 


Ser 


His 


Val 


Leu 


Gly 


580 










585 


Val 


Arg 


Ser 


Arg Asp 


Val 


595 










600 


Ala 


Val 


Thr 


Ala 


Val 


Ala 


610 










615 


Gln


Ala 


Ser 


Gln


Glu 


Leu 


625 










630 
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Met 


Arg 


His 


Arg 


Thr 


Asp 


Leu 


Gly 


Gln


Asn 


Leu 


Leu Leu Phe 


Leu 


1 








5 










10 
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320 










325 










330 


Glu 


Pro 


Lys 


Gln


Glu 


Leu 


Glu 


Asp 


Glu 


Asn 


Pro 


Ala 


Arg 


Ser 


Gly 










335 










340 










345 


Gly 


Gly 


Gly 


Asn 


Ser 


Asp 


Glu 


Val 


Pro 


Pro 


Pro 


Thr 


Leu 


Pro 


Ser 










350 










355 










360 


Asp 


Pro 


Pro 


Arg 


Pro 


Pro 


Asp 


Pro 


Ser 


Pro 


Arg 


Arg 


Ser 


Arg 


Ala 
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370 










375 


Pro 


Arg 


Arg 


Arg 


Pro 


Arg 


Pro 


Arg 


Pro 


Gln


Thr 


Arg 


Leu 


Arg 


Thr 










380 










385 










390 


Pro 


Pro 


Gln


Pro 


Arg 


Pro 


Arg 


Pro 


Pro 


Pro 


Arg 


Pro 


Arg 


Pro 


Arg 










395 










400 










405 


Arg 


Gly 


Pro 


Gly 


Gly 


Gly 


Cys 


Leu 


Asp 


Val 


Asp 


Phe 


Ala 


Val 


Gly 










410 










415 










420 


Pro 


Pro 


Gly 


Cys 


Ser 


His 


Val 


Asn 


Ser 


Phe 


Lys 


Val 


Gly 


Glu 


Asn 






425 










430 










435 


Trp 


Arg 


Gln


Glu 


Leu 


Arg 


Val 


Ile


Tyr 


Gln


Cys


Phe 


Val 


Trp 


Cys 










440 










445 










450 


Gly 


Thr 


Pro 


Glu 


Thr 


Arg 


Lys 


Ser 


Lys 


Ala 


Lys 


Ser 


Cys 


Ile


Cys 










455 










460 










465 


His 


Val 


Cys 


Gly 


Thr 


His 


Leu 


Asn 


Arg 


Leu 


His 


Ser 


Cys 


Leu 


Ser 










470 










475 










480 


Cys 


Val 


Phe 


Phe 


Gly 


Cys 


Phe 


Thr 


Glu 


Lys 


His 


Ile


His 


Glu 


His 










485 










490 










495 


Ala 


Glu 


Thr 


Lys 


Gln


His 


Asn 


Leu 


Ala 


Val 


Asp 


Leu 


Tyr 


Tyr 


Gly 










500 










505 










510 


Gly 


Ile


Tyr 


Cys 


Phe 


Met 


Cys 


Lys 


Asp 


Tyr 


Val 


Tyr 


Asp 


Lys 


Asp 










515 










520 










525 


Ile


Glu 


Gln


Ile


Ala 


Lys 


Glu 


Glu 


Gln


Gly 


Glu 


Ala 


Leu 


Lys 


Leu 










530








535 










540 


Gln


Ala 


Ser 


Thr 


Ser 


Thr 


Glu 


Val 


Ser 


His 


Gln


Gln


Cys 


Ser 


Val 










545 










550 










555 


Pro 


Gly 


Leu 


Gly 


Glu 


Lys 


Phe 


Pro 


Thr 


Trp 


Glu 


Thr 


Thr 


Lys 


Pro 










560 










565










570 


Glu 


Leu 


Glu 


Leu 


Leu 


Gly 


His 


Asn 


Pro 


Arg 


Arg 


Arg 


Arg 


Ile


Thr 










575 










580 










585 


Ser 


Ser 


Phe 


Thr 


Ile


Gly 


Leu 


Arg Gly 


Leu 


Ile


Asn 


Leu 


Gly 


Asn 










590 










595 










600 


Thr 


Cys 


Phe 


Met 


Asn 


Cys 


Ile


Val 


Gln


Ala 


Leu 


Thr 


His 


Thr 


Pro 








605 










610 










615 


Ile


Leu 


Arg 


Asp 


Phe 


Phe 


Leu 


Ser 


Asp 


Arg 


His 


Arg 


Cys 


Glu 


Met 










620 










625 










630


Pro 


Ser 


Pro 


Glu 


Leu 


Cys 


Leu 


Val 


Cys 


Glu 


Met 


Ser 


Ser 


Leu 


Phe 










635 








640 










645 


Arg 


Glu 


Leu 


Tyr 


Ser 


Gly 


Asn 


Pro 


Ser 


Pro 


His 


Val 


Pro 


Tyr 


Lys 










650 










655 










660 


Leu 


Leu 


His 


Leu 


Val 


Trp 


Ile


His 


Ala 


Arg 


His 


Leu 


Ala 


Gly 


Tyr 










665 










670 










675 


Arg 


Gln


Gln


Asp 


Ala 


His 


Glu 


Phe 


Leu 


Ile


Ala 


Ala 


Leu 


Asp 


Val 






680 










685 










690 


Leu 


His 


Arg 


His 


Cys 


Lys 


Gly 


Asp 


Asp 


Val 


Gly 


Lys 


Ala 


Ala 


Asn 










695 










700 










705 


Asn 


Pro 


Asn 


His 


Cys 


Asn 


Cys 


Ile


Ile


Asp 


Gln


Ile


Phe 


Thr 


Gly 










710 










715 










720 


Gly 


Leu 


Gln


Ser 


Asp 


Val 


Thr 


Cys 


Gln


Ala 


Cys 


His 


Gly 


Val 


Ser 








725 










730 










735 


Thr 


Thr 


Ile


Asp 


Pro 


Cys 


Trp 


Asp 


Ile


Ser 


Leu 


Asp 


Leu 


Pro 


Gly 
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740 745 750 



SGI. 


sjys ihjt oez^ 


Phe


Trp 


Pro 


Met 


Ser 


Pro 


Gly Arg 


Glu 


Ser 


Ser 






755










'~l C f\ 

760 










765 


Val


ASH \j±y isXM 


oer 


His 


Ile


Pro 


Gly


Ile


Thr 


Thr 


Leu 


Thr 


Asp 






770








775










A A 

780 


t-yS 


Leu


Phe


Thr 


Arg 


Pro 


Glu 


His


Leu Gly 


Ser 


Ser 


Ala 






785










790 










795 


Lys


Ile Lys Cys


Gly


Ser 


Cys 


T -1-1 

Gln


Ser 


Tyr 


Gln


Glu 


Ser 


Thr 


Lys 






800










805 










810 


Gln


Leu Thr Met


Asn 


Lys 


Leu 


Pro 


Val 


Val 


Ala 


Cys


Phe 


His 


Phe 






815 










820 








825 


Lys


Arg Phe Glu


His


Ser 


Ala 


Lys 


Gln


Arg 


Arg 


Lys 


Ile


Thr 


Thr 






830










835 










840 


Tyr


Ile Ser Phe


Pro 


Leu Glu 


Leu 


Asp 


Met 


Thr 


Pro 


Phe 


Met 


Ala 






845 










850 










855 


Ser 


Ser Lys Glu 


Ser 


Arg 


Met 


Asn 


Gly 


Gln


Leu 


Gln


Leu 


Pro 


Thr 






860 










865 










870 


Asn 


Ser Gly Asn 


Asn 


Glu 


Asn 


Lys 


Tyr 


Ser 


Leu 


Phe 


Ala 


Val 


Val 






875 










880 










885 


Asn 


His Gln Gly


Thr 


Leu 


Glu 


Ser 


Gly 


His 


Tyr 


Thr 


Ser 


Phe 


Ile






890 










895 










900 


Arg 


His His Lys 


Asp 


Gln


Trp


Phe 


Lys 


Cys 


Asp 


Asp 


Ala 


Val 


Ile






905 










910 










915 


Thr


Lys Ala Ser 


Ile


Lys 


Asp 


Val 


Leu 


Asp 


Ser 


Glu 


Gly 


Tyr 


Leu 






920 










925 






930 


Leu 


Phe Tyr His 


Lys 


Gln


Val 


Leu 


Glu 


His 


Glu 


Ser 


Glu 


Lys 


Val 






935 










940 








945 


Lys 


Glu Met Asn 


Thr 


Gln


Ala 


Tyr 

















950 

<210> 22
<211> 2204 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 275791CB1 

<400> 22 

atatgccaat agacctgaca agtctgaatt ggaaactcag attgacagaa tgacgaagaa 60 
gagctttagc agctgtcttg gagataagta agagagatgc ttcaccatct ctgagtcatg 120 
aagatgatga taagccaact agcagcccag ataccggatt tgcagaagat gatattcaag 180 
aaatgccgga aaatccagac actatggaaa ctgagaagcc caaaacaatc acagagctgg 240 
atcctgccag ttttactgag ataactaaag actgtgatga gaatciaagaa aacaaaactc 300 
cagaaggatc tcagggagaa gttgattggc tccagcagta tgatatggag cgtgaaaggg 360 
aagagcaaga gcttcagcag gcactggctc agagccttca agagcaagag gcttgggaac 420 
agaaagaaga tgatgacctc aaaagagcta ccgagttaag tcttcaagag tttaacaact 480 
cctttgtgga tgcattgggt tctgatgagg actctggaaa tgaggatgtt tttgatatgg 540 
agtacacaga agctgaagct gaggaactga aaagaaatgc tgagacagga aatctgcctc 600 
attcgtaccg gctcatcagt gttgtcagtc acattggtag cacttcttct tcaggtcatt 660 
acattagtga tgtatatgac attaagaagc aagcgtggtt tacttacaat gacctggagg 720 
tatcaaaaat ccaagaggct gccgtgcaga gtgatcgaga tcggagtggc tacatcttct 780 
tttatatgca caaggagatc tttgatgagc tgctggaaac agaaaagaac tctcagtcac 840 
ttagcacgga agtggggaag actacccgtc aggcctcgtg aggaacaaac tcctgggttg 900 
gcagcatgca ctgcatattt gttactgctg cccacctcac ctttcctctg ctgaaggaga 960 
atttggaatt ctacttgatg cgggagcaac aaacagctca gggccaaacc aaaagacciaa 1020 
aattggagta acgtagaatg ctccatgcta ttttatggaa actttggtct cacatccgta 1080 
gctgattatc ctctttttct cctatgagtg gcacttcttt tgtcttagga atacatgttg 1140 
taaatatata tctgtgtatg tgtgtataca cacacacaga cacacacaca cacacacggg 1200 
atgaatggag ccttaaagag ttaggatgag ccaccagaat atgcctgctc aaaattaata 1260 
gcacagcagt ttggagaaga aatgaaggtg tcaaagagtc cattcacctg agaaatgtgt 1320 
gaagacatac ttatcagttg gcttttagct tttatgttcc ttgagtagtt tcactcaagt 1380 
ctgtaacctt ttgtgtttcc ttattagtaa aattcactgg aaagccagct cttcatgtta 1440 
cactaatgac agtttgttct ctttgcaaga gaggggcatt actgtcacct gacttgagga 1500 
gctgttttgt tgttgttgtt gtctgcaaat ttcatgaatt tgtgatgtct ttgctgttta 1560 
catgcagtcc caagaaatgg attgttggtg ctttggaata tgttacagtc ccacatttga 1620
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tatttcttat atactttgtt ttctctaagg agatttcttc acacagtatg ttcatcatat 1680 

atcatcatca ttattatggt ggtaaagata gaatcttttt tctttttttg tcattctggc 1740 

catggagcag cattacccta atggattgca accaaaactt taaacaagta gaaagataat 1800 

atttctccaa ttgggactcc ccagcaggaa tacttaggga tiaaggaagaa tgctagcatc 1860 

tctgtctctc aaacataggg aggataagaa gagtgttctt ctggtaaagc taaaattctg 1920 

gaccactgaa gctaaaagcc ctattgcaag tatgaaatta agtacttgag ctataggaca 1980 

aaccttgggc atttaaccat ttactgtctg gctttgccct taaaataggg ttgcaattaa 2040 

aatgtgatitg gcttaggtaa tcccaaaaac taacaaataa caaaggtgca taatttattt 2100 

atctactttt taggtgctct gagttgaggc aaagtagagc ggcaacatta agtgctatgc 2160 

tagtcactta gctgacgtaa ccagctttgg taagcagctt atga 2204 

<210> 23 
<211> 2036 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 1389845CB1 

<400> 23 

ccgatggggg ttaggctcca gggcttctgt cgagaccaag gatgcccaaa tatctgggtg 60 

gtgggtgctg catacctggg ccctgggcag aacgaagggt atacagcctg ggccaccagg 120 

ataagtccag aacccaccag gagctgagga cagacagaag gaccacggag ggggtgacgg 180 

gctggtgtga ggattggtgc ccctgggcca ggactctcct ctcttctccc tgctggctcc 240 

agaccagagt ccaagcccta ggcagtgcca cccttaccca gcccagcctt gaagacagaa 300 

tgagaggggt ttcctgtctc caggtcctgc tccttctggt gctgggagct gctgggactc 360 

agggaaggaa gtctgcagcc tgcgggcagc cccgcatgtc cagtcggatc gttgggggcc 420 

gggatggccg ggacggagag tggccgtggc aggcgagcat ccagcatcgt ggggcacacg 480 

tgtgcggggg gtcgctcatc gccccccagt gggtgctgac agcggcgcac tgcttcccca 540 

ggagggcact gccagctgag taccgcgtgc gcctgggggc gctgcgtctg ggctccacct 600 

cgccccgcac gctctcggtg cccgtgcgac gggtgctgct gcccccggac tactccgagg 660 

acggggcccg cggcgacctg gcactgctgc agctgcgtcg cccggtgccc ctgagcgctc 720 

gcgtccaacc cgtctgcctg cccgtgcccg gcgcccgccc gccgcccggc acaccatgcc 780 

gggtcaccgg ctggggcagc ctccgcccag gagtgcccct cccagagtgg cgaccgctac 840 

aaggagtaag ggtgccgctg ctggactcgc gcacctgcga cggcctctac cacgtgggcg 900 

cggacgtgcc ccaggctgag cgcattgtgc tgcctgggag tctgtgtgcc ggctaccccc 960 

agggccacaa ggacgcctgc caggtgtgca cccagcctcc ccagcctccg gagtcccctc 1020 

cctgtgccca gcaccctccc tccctgaact ccaggaccca ggacatccca actcaggctc 1080 

aggatcctgg cctccaacct agaggcacca cgccaggggt ctggaaccct gagaactgaa 1140 

gtcctgggag ggctgggact taggctcctc tttctcctgc agggtgattc tgggggacct 1200 

ctgacctgcc tgcagtctgg gagctgggtc ctggtgggcg tggtgagctg gggcaagggt 1260 

tgtgccctgc ccaaccgtcc aggggtctac accagtgtgg ccacatatag cccctggatt 1320 

caggctcgcg tcagcttcta atgctagccg gtgaggctga cctggagcca gctgctgggg 1380 

tccctcagcc tcctggttca tccaggcacc tgcctatacc ccacatccct tctgcctcga 1440 

ggccaagatg cctaaaaaag ctaaaggcca ccccaccccc cacccaccac ctcctgcctc 1500 

ctctcctctt tggggatcac cagctctgac tccaccaacc ctcatccagg aatctgccat 1560 

gagtcccagg gagtcacact ccccactccc ttcctggctt gtatttactt ttcttggccc 1620 

tggccagggc tgggcgcaag gcacgcagtg atgggcaaac caattgctgc ccatctggcc 1680 

tgtgtgccca tctttttctg gagaaagtca gattcacagc atgacagaga tttgacacca 1740

gggagatcct ccatagctgg ctttgaggac acggggacca cagccatgag cggcctctaa 1800 

gagctgagag acagccggca gggaatcgga accctcagac ccacagccgc aaggcactgg 1860 

attctggcag caccctgaag gagctgggaa gtaagttctt ccccagcctc cagataagag 1920 

ccccgccggc caatcccttc atttcaacct aaagagaccc taagcagaga acctagctga 1980 

gccactcctg acctacaaag ttgtgactta ataaatgtgt gctttaagct gctcca 2036

<210> 24 
<211> 2185 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 1726609CB1 

<400> 24 

gccatgcctc ctgcccacgg ccaccagcaa gctgtcgggc gcagtggagc agtggctgag 60 
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tgcagctgag cggctgtatg ggccctacat gtggggcagg tacgacattg tcttcctgcc 120 
accctccttc cccatcgtgg ccatggagaa cccctgcctc accttcatca tctcctccat 180 
cctggagagc gatgagttcc tggtcatcga tgtcatccac gaggtggccc acagttggtt 240 
cggcaacgct gtcaccaacg ccacgtggga agagatgtgg ctgagcgagg gcctggccac 300
ctatgcccag cgccgtatca ccaccgagac ctacggtgct gccttcacct gcctggagac 360
tgccttccgc ctggacgccc tgcaccggca gatgaagctt ctgggagagg acagcccggt 420 
cagcaaactg caggtcaagc tggagccagg agtgaatccc agccacctga tgaacctgtt 480 
cacctacgag aagggctact gcttcgtgta ctacctgtcc cagctctgcg gagacccaca 540 
gcgctttgat gactttctcc gagcctatgt ggagaagtac aagttcacca gcgtggtggc 600 
ccaggacctg ctggactcct tcctgagctt cttcccggag ctgaaggagc agagcgtgga 660 
ctgccgggca gggctggaat tcgagcgctg gctcaatgcc acaggcccgc cgctggctga 720 
gccggacctg tctcagggat ccagcctgac ccggcccgtg gaggcccttt tccagctgtg 780 
gaccgcagaa cctctggacc aggcagctgc ctcggccagc gccattgaca tctccaagtg 840 
gaggaccttc cagacagcac tcttcctgga ccggctcctg gatgggtccc cgctgccgca 900
ggaggtggtg atgagcctgt ccaagtgcta ctcctccctg ctggactcga tgaacgctga 960 
gatccgcatc cgctggctgc agattgtggt ccgcaacgac tactatcctg acctccacag 1020 
ggtgcggcgc ttcctggaga gccagatgtc acgcatgtac accatcccgc tgtacgagga 1080 
cctctgcacc ggtgccctca agtccttcgc gctggaggtc ttctaccaga cgcagggccg 1140 
gctgcacccc aacctgcgca gagccatcca gcagatcctg tcccagggcc tgggctccag 1200 
cacagagccc gcctcagagc ccagcacgga gctgggcaag gctgaagcag acacagactc 1260 
ggacgcacag gccctgctgc ttggggacga ggcccccagc agtgccatct ctctcaggga 1320 
cgtcaatgtg tctgcctagc cctgttggcg ggctgaccct cgacctccca gacaccacaa 1380 
ttgtgccttc tgtgggccag gcctgccatg actgcgtctc ggctctggcc atgagctctg 1440 
cccaggccca caagcccctc ccctgggctc tcccaggcag ggagaatggg gagagggacc 1500
tccttgtgtc tggcagagac ctgtggacct ggcctcccca ctcccagctc tcttgcactg 1560 
caggccctgg ggccagcccg cacacaccat gcctcctgtc tcaacactga cagctgtgcc 1620 
tagccccgga tgccagcacc tgccaggtgc cgccccgggg caagggcccc agcagcccta 1680 
tggtgaccgc cacactgtgc cttaatgtct gccgggggcc caggctgtgc tgtccctgca 1740 
gcacgcctcc ttgcagggat ctgagccacc ctccccgcac agccctgcac cccgccccta 1800 
gggttggcag cctcagttgg cccctggcag aggaacaagg acacagacat tccctcagtg 1860 
tggggggcag gggacacagg gagaggatgg ttgtccctgg ggagggccct ctggccccag 1920 
gcaaccttag cccctcagaa cagggagtcc caggacccag ggagagtgtg gggacaggac 1980 
agcctgtctc ttgtagcttc ctggggtggg aggcacaggg gcaaagcaat accccaggga 2040 
aagtgggagg tggtgctggt gctctctcca ggcccaccat gctgggagag gcggccagag 2100 
cctggggcct ccagcctggg actgctgtga tggggtatca cggtgatggt cccattaaac 2160 
ttccactctg caaaaaaaaa aaaaa 2185 

<210> 25 

<211> 3486 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 4503848CB1 

<400> 25 

ctgtcttaaa aaaagaggga gggaagatta ctgatttaat ttataaagga gaattattat 60 
agctccaaca cctgacttta tttatctata tggtttaatt acacaaacaa ttcagtgttt 120 
gaattataca aatttcatta aaactatgta attatgcaag aaaaatagga aatacagggg 180 
cacttagttt tgtgcatatg tgttcacctg agagtatttg cttgtttttt taaaaaggtt 240 
ctttttaatt taatatttaa ttttataatg cacattcata tgttgacttt ggaccaacag 300
aaatccctaa ttcttattct ttttctgatt ctttttagag ttggtggttc caggatttta 360 
ctcagaatga cgttaggaag agaagtgatg tctcctcttc aggcaatgtc ttcctatact 420 
gtggctggca gaaatgtttt aagatgggat ctttcaccag agcaaattaa aacaagaact 480 
gaggagctca ttgtgcagac caaacaggtg tacgatgctg ttggaatgct cggtattgag 540 
gaagtaactt acgagaactg tctgcaggca ctggcagatg tagaagtaaa gtatatagtg 600 
gaaaggacca tgctagactt tccccagcat gtatcctctg acaaagaagt acgagcagca 660 
agtacagaag cagacaaaag actttctcgt tttgatattg agatgagcat gagaggagat 720 
atatttgaga gaattgttca tttacaggaa acctgtgatc tggggaagat aaaacctgag 780 
gccagacgat acttggaaaa gtcaattaaa atggggaaaa gaaatgggct ccatcttcct 840 
gaacaagtac agaatgaaat caaatcaatg aagaaaagaa tgagtgagct atgtattgat 900 
tttaacaaaa acctcaatga ggatgatacc ttccttgtat tttccaaggc tgaacttggt 960
gctcttcctg atgatttcat tgacagttta gaaaagacag atgatgacaa gtataaaatt 1020 
accttaaaat atccacacta tttccctgtc atgaagaaat gttgtatccc tgaaaccaga 1080 
agaaggatgg aaatggcttt taatacaagg tgcaaagagg aaaacaccat aattttgcag 1140 
cagctactcc cactgcgaac caaggtggcc aaactactcg gttatagcac acat^ctgac 1200 
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ttcgtccttg aaatgaacac tgcaaagagc acaagccgcg taacagcctt tctagatgat 1260 
ttaagccaga agttaaaacc cttgggtgaa gcagaacgag agtttatttt gaatttgaag 1320 
aaaaaggciat gcaaagacag gggttttgaa tatgatggga aaatcaatgc ctgggatcta 1380 
tattactaca tgactcagac agaggaactc aagtattcca tagaccaaga gttcctcaag 1440 
gaatacttcc caattgaggt ggtcactgaa ggcttgctga acacctacca ggagttgttg 1500 
ggactttcat ttgaacaaat gacagatgct catgtttgga acaagagtgt tacactttat 1560 
actgtgaagg ataaagctac aggagaagta ttgggacagt tctatttgga cctctatcca 1620 
agggaaggaa aatacaatca tgcggcctgc ttcggtctcc agcctggctg ccttctgcct 1680
gatggaagcc ggatgatggc agtggctgcc ctcgtggtga acttctcaca gccagtiggca 1740 
ggtcgtccct ctctcctgag acacgacgag gtgaggactt actttcatga gtttggtcac 1800 
gtgatgcatc agatttgtgc acagactgat tttgcacgat ttagcggaac aaatgtggaa 1860
actgactttg tagaggtgcc atcgcaaatg cttgaaaatt gggtgtggga cgtcgattcc 1920 
ctccgaagat tgtcaaaaca ttataaagat ggaagcccta ttgcagacga tctgcttgaa 1980 
aaacttgttg cttctaggct ggtcaacaca ggtcttctga ccctgcgcca gattgttttg 2040 
agcaaagttg atcagtctct tcataccaac acatcgctgg atgctgcaag tgaatatgcc 2100 
aaatactgct cagaaatatt aggagttgca gctactccag gcacaaatat gccagctacc 2160 
tttggacatt tggcaggggg atacgatggc caatattatg gatatctttg gagtgaagta 2220 
ttttccatgg atatgtttta cagctgtttt aaaaaagaag ggataatgaa tccggaggtt 2280 
ggaatgaaat acagaaacct aatcctgaaa cctgggggat ctctggacgg catggacatg 2340 
ctccacaatt tcttgaaacg tgagccaaac caaaaagcgt tcctaatgag tagaggcctg 2400 
catgctccgt gaactgggga tctttggtag ccgtccatgt ctggaggaca agtcgacatc 2460 
accatgtgtt actggcctgg aaactgaagg gagttttgca agtgaaaatt tagatttcta 2520 
ttgacatcct tttgttttct aattttaaaa attataaaga tgtaaatgga attataaata 2580 
ctgtgaccta agaaaagacc cactagaaag taattgtact atiaaaatttc ataaaactigg 2640 
atttgatttc tttttatgaa agtttcatat gaatgtaact tgatttttta ctattataat 2700 
ctagataata tgatataaga gggctaagaa tttttaaatt gaatcatata tatgatataa 2760 
tttgatcctt cttgtatctt gaagttttgt acttgggatt tctggactga taaatgaatc 2820 
atcacattct tctggtaaat attttcttgg agctctgtgt caactttgat cctttgtctc 2880 
ccaggaaggt gtgacctctc ctttgcctgc atacctcaag gccaggggaa tatgcctcag 2940 
tgatgcattt atctttgtat atcaggccgc atgattccca actttctgcc acacttaaat 3000
tacgttcctc catttcagtt ttgtcttttc tgtctaaagt tcagtcaaag agtatcsiaaa 3060
aattatgttt cagctagact ggtgtaatgt ataagttttt gtatcttgta ttagaggatt 3120 
tcgtagcttt tattagaggc tcatttccac ctcagcatac aagatcgtta gtcttttggc 3180 
atgtgtgcca attagaatac taaagcaagt ccaagcacat ttttctcttc tcacgtttct 3240 
aataagtgtt agggactttg cctcttttac ttaccacgtc cccaaaagtg tcaggtagac 3300 
atgtcscasa tiggctctg^a gagagccatzg ggaagagaga ggaggtggati gtggaaca^a 3360 
aagggttcag aaactccaga agaggagtgg gttttggata gaagcatttg aggacagctg 3420 
ctccaaagcc ttatgtgtat gatgaaactt aaccacgggg aagagactct tcagtagcct 3480 
gttctg 3486 

<210> 26 

<211> 2847 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 5544089CB1 

<400> 26 

caatgacgct tggacgagga tttatttcta caagctaatt gaatccagga gcagctttaa 60 
ttattaacac taacggaaga gaaaaggagt atttccaagg gctcaaatgg aagctgtact 120 
cagtccggtg gaggcagggg gaggtaaagt ttctcacact caagtcgtct tcatagttta 180 
ctgtcctttt ccaaacaaaa gctaataacg ccatacgcat ccacacactc cctcctggat 240 
gaacctaagt ctcgtcccca ctgtcacccc aaggccagtt atcaaaaact gttccttctc 300 
tgccctcaaa gactgaagcc gcaggccctg ttctgcctct gctcaggaat ctgattgctc 360 
ttaaagtgct cttacaagat tccgtcgatg tttgctccct ctgtcttgtc atcaggacta 420 
agtggtggag catcciaaagg tagaaagatg gaacttattc agccaaagga gccsiacttca 480 
cagtacattt ctctttgtca tgaattgcat actttgttcc aagtcatgtg gtctggaaag 540 
tgggcgttgg tctcaccatt tgctatgcta cactcagtgt ggagactcat tcctgccttt 600 
cgtggttacg cccaacaaga cgctcaggaa tttctttgtg aacttttaga taaaatacaa 660 
cgtgaattag agacaactgg taccagttta ccagctctta tccccacttc tcaaaggaaa 720 
ctcatcaaac aagttctgaa tgttgtaaat aacatttttc atggacaact tcttagtcag 780 
gttacatgtc ttgcatgtga caacaaatca ciataccatag aacctttctg ggacttgtca 840 
ttggagtttc cagaaaggta tcaatgcagt ggaaaagata ttgcttccca gccatgtctg 900 
gttactgaaa tgttggccaa atttacagaa actgaagctt tagaaggaaa aatctacgta 960 
tgtgaccagt gtaactcaaa gcgtagaagg ttttcctcca aaccagttgt actcacagaa 1020 
gcccagaaac aacttatgat atgccaccta cctcaggttc tcagactgca cctcaaacga 1080 
ttcaggtggt caggacgtaa taaccgagag aagattggtg ttcatgttgg ctttgaggaa 1140 
atcttaaaca tggagcccta ttgctgcagg gagaccctga aatccctcag accagaatgc 1200 
tttatctatg acttgtccgc ggtggtgatg caccatggga aaggatttgg ctcagggcac 1260 
tacactgcct actgctataa ttctgaagga gggttctggg tacactgcaa tgattccaaa 1320 
ctaagcatgt gcactatgga tgaagtatgc aaggctcaag cttatatctt gttttatacc 1380 
caacgagtta ctgagaatgg acattctaaa cttttgcctc cagagctcct gttggggagc 1440 
caacatccca atgaagacgc tgatacctcg tctaatgaaa tccttagctg atccaaagac 1500 
aatggggttt tcttcctgtg atttatatat atacttttta aaagactgat gtaccatttt 1560 
aaacttcatt ttttcttgtg aatcagtgta tactacattt atacatttta tatctaacaa 1620 
tttttttttt tacaaagtat aaatgtatat atcaactgaa ggtaactact tttttcatat 1680 
ttggagtttt aaacttttgg tgtttacctc agactgatgt tacctctttt atatttttat 1740 
gtcttaattg gctcggatga tgaacttgtg caatcttcta ccaacaaagt tcaagtggca 1800 
tcattttata tacatgtatc tttttcaggt attttctata caciattctta atagatggaa 1860 
aattagactc tactttggtc actaatagtc tttcatttgt atattgaagt taccttgccc 1920 
cttggagtta ttgaagtgac atgtcaaggt atcacctaaa tattcttcag tcacactcac 1980 
tggtatttct gaggctttgt gtgttaacag gccttgtaat tgacattatt ttggttaatg 2040 
taaccccaaa attgctttag taattgctct ttggcatagt caaactataa atgaaaatgg 2100 
cagctttaca aatagtatat ttaagtgaac tctggaacta tggacatgaa aaaaatgatg 2160 
gctgggattt atgatttttg tctggcagca aacaggtttg tccagaagtc taataattaa 2220 
gcagtcataa aaagtctgaa tttagtaaac cagtgtatga tgttattcaa atagtttacc 2280 
ttgggtatga gttcatttta taatgtctga tgacattaga tctcttaaaa ctttatgtat 2340 
tttttttagt tcaaaggaat agagtcttga agagaaaaaa ttatagggca gaaaagataa 2400 
gtgttcaaaa ttggcaactg gactattatt atgtctagca tctcattcta aataactaaa 2460 
gcttgattta ctcttgctag gattatgtga ctactaggta ggagcctctt aaaacactgg 2520 
ccctgagcat taaaaaaaaa aaaaaaaact aaaagctatc tatctaaact tgcaaaaaaa 2580 
aaattccggt gggggtcacc cttttccttc ttctgaaaat ctcacggggt ttctttaaag 2640 
ccctgttgct gcaaacttta tcttttttgg ggggggtaga atcacctaat ctctgtagac 2700 
cagctatgtt tctaagctct gttaaccacg gggagatctg gtaccccttt tttaaaaggg 2760 
ggtttatttg cgggttgaag tcttagtgaa aagtagtccc ctggagaatg cggtccaccc 2820 
ctgggggcca tctgttaggt aaaactt 2847 

<210> 27 

<211> 890 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7474081CB1 

<400> 27 

gaggccaaga attcggcacg aggcacttiac tccctgagct aagggggaag agctggatca 60 
ccatgEiaata tgtcttctat ttgggtgtcc tcgctgggac atttttcttt gctgactcat 120 
ctgttcagaa agaagaccct gctccctatt tggtgtacct caagtctcac ttcaacccct 180 
gtgtgggcgt cctcatcaaa cccagctggg tgctggcccc agctcactgc tatttaccaa 240 
atctgaaagt gatgctggga aatttcaaga gcagagtcag agacggtact gaacagacaa 300 
ttaaccccat tcagatcgtc cgctactgga actacagtca tagcgcccca caggatgacc 360 
tcatgctcat caagctggct aagcctgcca tgctcaatcc caaagticcag ccccttaccc 420 
tcgccaccac caatgtcagg ccaggcactg tctgtctact ctcaggtttg gactggagcc 480 
aagaaaacag tggccgacac cctgacttgc ggcagaacct ggaggccccc gtgatgtctg 540 
atcgagaatg ccaaaaaaca gaacaaggaa aaagccacag gaattcctta tgtgtgaaat 600 
ttgtgaaagt attcagccga atttttgggg aggtggccgt tgctactgtc atctgcaaag 660 
acaagctcca gggaatcgag gtggggcact tcatgggagg ggacgtcggc atctacacca 720 
atgtttacaa atatgtatcc tggattgaga acactgctaa ggacaagtga gaccctactt 780 
ctccctctgc attccactgg ctctgccatg gactatacaa gcagataatt ttccctctat 840 
tcaaaataaa atctccaaat gaaaatttgg gaatgtagca aaaaaaaaaa 890 



<210> 28 

<211> 1577 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 5281209CB1 

<400> 28 

atgcagccca cgggccgcga gggttcccgc gcgctcagcc ggcggtatct gcggcgtctg 60 
ctgctcctgc tactgctgct gctgctgcgg cagcccgtaa cccgcgcgga gaccacgccg 120 
ggcgccccca gagccctctc cacgctgggc tcccccagcc tcttcaccac gccgggtgtc 180 
cccagcgccc tcactacccc aggcctcact acgccaggca cccccaaaac cctggacctt 240 
cggggtcgcg cgcaggccct gatgcggagt ttcccactcg tggacggcca caatgacctg 300 
ccccaggtcc tgagacagcg ttacaagaat gtgcttcagg atgttaacct gcgaaatttc 360 
agccatggtc agaccagcct ggacaggctt agagacggcc tcgtgggtgc ccagttctgg 420 
tcagcctccg tctcatgcca gtcccaggac cagactgccg tgcgcctcgc cctggagcag 480 
attgacctca ttcaccgcat gtgtgcctcc tactctgaac tcgagcttgt gacctcagct 540 
gaaggtctga acagctctca aaagctggcc tgcctcattg gcgtggaggg tggtcactca 600 
ctggacagca gcctctctgt gctgcgcagt ttctatgtgc tgggggtgcg ctacctgaca 660 
cttaccttca cctgcagtac accatgggca gagagttcca ccaagttcag acaccacatg 720 
tacaccaacg tcagcggatt gacaagcttt ggtgagaaag tagtagagga gttj^aaccgc 780 
ctgggcatga tgatagattt gtcctatgca tcggacacct tgataagaag ggtcctggaa 840 
gtgtctcagg ctcctgtgat cttctcccac tcagctgcca gagctgtgtg tgacaatttg 900 
ttgaatgttc ccgatgatat cctgcagctt ctgaagaaga acggtggcat cgtgatggtg 960 
acactgtcca tgggggtgct gcagtgcaac ctgcttgcta acgtgtccac tgtggcagat 1020 
cactttgacc acatcagggc agtcattgga tctgagttca tcgggattgg tggaaattat 1080 
gacgggactg gccggttccc tcaggggctg gaggatgtgt ccacataccc agtcctgata 1140 
gaggagttgc tgagtcgtag ctggagcgag gaagagcttc aaggtgtcct tcgtggaaac 1200 
ctgctgcggg tcttcagaca agtggaaaag gtgagagagg agagcagggc gcagagcccc 1260 
gtggaggctg agtttccata tgggcaactg agcacatcct gccactccca cctcgtgcct 1320 
cagaatggac accaggctac tcatctggag gtgaccaagc agccaaccaa tcgggtcccc 1380 
tggaggtcct caaatgcctc cccatacctt gttccaggcc ttgtggctgc tgccaccatc 1440 
ccaaccttca cccagtggct ctgctgacac agtcggtccc cgcagaggtc actgtggcaa 1500 
agcctcacaa agccccctct cctagttcat tcacaagcat atgctgagaa taaacatgtt 1560 
acacatggaa aaaaaaa 1577 

<210> 29 

<211> 1958 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 2256251CB1 

<400> 29 

aagcggtcga gctcggcatt cattgtaacg gcgccatgtg ctggaaaggt cgtgtggttt 60 
ctgctcgcat ctctcggttg agtggggctg gtcggggtgt gctgcagggc tgtctccccc 120 
accaccactg tagtcagtct gtaccttggg agatgctctg aggccatgaa acaacctggc 180 
cctcctcgaa ctttctcccc acagcgtccc caccgtggcc ctggaaccag ctgggggctt 240 
tgccgtgtgg gagagccggt gccccagccc acacccgctg cctatctata aacatctctg 300 
tctgtctaca tcccagcttc ccttccattc agcccagtgg gcacactcca tcaccagcac 360 
aattatccag ctcaggcaga cccaggtgtg gccagtgggt ctccagctga cttcctctta 420 
atttcttctt aacttactga ggtgaagttt acagaagata aagttacatt atcaagcgtc 480 
caattcagag gccccgagca ccttggacag tgetg^ctgc cccccgcccc ccgaatttta 540 
tccagcttca aatatctcca ccacccctga aggaaaaccg gggcccacta ggcagccacc 600 
cctagcaccc cgggccttct cgggggctcc accgttctga gcccctttct agccgcctag 660 
gggccctctg cagcctttcc accgcctccg ggagccctgg ttagtttgtg gagcatggtg 720 
cttagaaaag acgacctgag ccccaggcgc gctcactgct cctgagacgt catttctgct 780 
gcacccacga tgcttctggg gagagtctgg cagacgagag agctgaagag caaagtcccc 840 
aagaaggcag ggaggtgtgg tcagggaagg cttcatggag gaagtgcagt gggcttcttg 900 
ggatccccac caggcacccc ttcctccttc gacttagggt gtggccggcc gcaggtttcg 960 
gatgcaggcg gccggatcgt ggggggtcac gctgccccgg ccggcgcatg gccatggcag 1020 
gccagcctcc gcctgcggag gcftgeaegtg tgcggcgggt cactgctcag cccccagtgg 1080 
gtgctcacag ctgcccactg cttctccggg tccctgaact catccgacta ccaggtgcac 1140 
ctgggggaac tggagatcac tctgtctccc cacttctcca ccgtgaggca gatcatcctg 1200 
cactccagcc cctcaggaca gccggggacc agcggggaca tcgccctggt ggagctcagt 1260 
gtccccgtga ccctcttcag ccggatcctg cccgtctgcc tcccggaggc ctcagatgac 1320 
ttctgccctg ggatccggtg ctgggtgacc ggctggggct atacgcggga gggagagcct 1380 
ctgccacccc cgtacagcct gcgggaggtg aaagtctccg tggtggacac agagacctgc 1440 
cgccgggact atcccggccc cgggggcagc atccttcagc ccgacatgct gtgtgcccgg 1500 
ggccccgggg atgcctgcca ggacgactcc ggggggcctc tggtctgcca ggtgciacggt 1560 
gcctgggtgc aggctggcat tgtgagctgg ggtgagggct gcggccgccc caacaggccg 1620 
ggagtctaca ctcgtgtccc tgcctacgtg aactggatcc gccgccacat cacagcatca 1680 
gffcfffcrctcag agtctgggta ccccaggctc cccctcctgg ctggcttatt cctccccggc 1740 
ctcttccttc tgctagtctc ctgtgtcctg ctggccaagt gcctgctgca cccatctgcg 1800 
gatggtactc ccttccccgc ccctgactga tggcaggaat ccaagtgcat ttcttaaata 1860 
agttactatt tattccgctc cgccccctcc ctctcccttg agaagctgag tcttctgcat 1920 
cagattattg caacatttaa cctgaattta acgacacc 1958 



<210> 30 

<211> 3106 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7160544CB1 

<400> 30 

gctccgaggc caaggccgct gctactgccg ccgctgcttc ttagtgccgc gttcgccgcc 60 
tgggttgtca ccggcgccgc cgctgaggaa gccactgcaa ccaggaccgg agtggaggcg 120 
gcgcagcatg aagcggcgca ggcccgctcc atagcgcacg tcgggacggt ccgggcgggg 180 
ccggggggaa ggaaaatgca acatggcagc agcaatggaa acagaacagc tgggtgttga 240 
gatatttgaa actgcggact gtgaggagaa tattgaatca caggatcggc ctaaattgga 300 
gcctttttat gttgagcggt attcctggag tcagcttaaa aagctgcttg ccgataccag 360 
aaaatatcat ggctacatga tggctaaggc accacatgat ttcatgtttg tgaagaggaa 420 
tgatccagat ggacctcatt cagacagaat ctattacctt gccatgtctg gtgagaacag 480 
agaaaataca ctgttttatt ctgaaattcc caaaactatc aatagagcag cagtcttaat 540 
gctctcttgg aagcctcttt tggatctttt tcaggcaaca ctggactatg gaatgtattc 600 
tcgagaagaa gaactattaa gagaaagaaa acgcattgga acagtcggaa ttgcttctta 660 
cgattatcac caaggaagtg gaacatttct gtttcaagcc ggtagtggaa tttatcacgt 720 
aaaagatgga gggccacaag gatttacgca acaaccttta aggcccaatc tagtggaaac 780 
tagttgtccc aacatacgga tggatccaaa attatgccct gctgatccag actggattgc 840 
ttttatacat agcaacgata tttggatatc taacatcgta accagagaag aaaggagact 900 
cacttatgtg cacaatgagc tagccaacat ggaagaagat gccagatcag ctggagtcgc 960 
tacctttgtt ctccaagaag aatttgatag atattctggc tattggtggt gtccaaaagc 1020 
tgaaacaact cccagtggtg gtaaaattct tagaattcta tatgaagaaa atgatgaatc 1080 
tgaggtggaa attattcatg ttacatcccc tatgttggaa acaaggaggg cagattcatt 1140 
ccgttatcct aaaacaggta cagcaaatcc taaagtcact tttaagatgt cagaaataat 1200 
gattgatgct gaaggaagga tcatagatgt catagataag gaactaattc aaccttttga 1260 
gattctattt gaaggagttg aatatattgc cagagctgga tggactcctg agggaaaata 1320 
tgcttggtcc atcctactag atcgctccca gactcgccta cagatagtgt tgatctcacc 1380 
tgaattattt atcccagtag aagatgatgt tatggaaagg cagagactca ttgagtcagt 1440 
gcctgattct gtgacgccac taattatcta tgaagaaaca acagacatct ggataaatat 1500 
ccatgacatc tttcatgttt ttccccaaag tcacgaagag gaaattgagt ttatttttgc 1560 
ctctgaatgc aaaacaggtt tccgtcattt atacaaaatt acatctattt taaaggaaag 1620 
caaatataaa cgatccagtg gtgggctgcc tgctccaagt gatttcaagt gtcctatcaa 1680 
agaggagata gcaattacca gtggtgaatg ggaagttctt ggccggcatg gatctaatat 1740 
ccaagttgat gaagtcagaa ggctggtata ttttgaaggc accaaagact cccctttaga 1800 
gcatcacctg tacgtagtca gttacgtaaa tcctggagag gtgacaaggc tgactgaccg 1860 
tggctactca cattcttgct gcatcagtca gcactgtgac ttctttataa gtaagtatag 1920 
taaccagaag aatccacact gtgtgtccct ttacaagcta tcaagtcctg aagatgaccc 1980 
aacttgcaaa acaaaggaat tttgggccac cattttggat tcagcaggtc ctcttcctga 2040 
ctatactcct ccagaaattt tctcttttga aagtactact ggatttacat tgtatgggat 2100 
gctctacaag cctcatgatc tacagcctgg aaagaaatat cctactgtgc tgttcatata 2160 
tggtggtcct caggtgcagt tggtgaataa tcggtttaaa ggagtcaagt atttccgctt 2220 
gaatacccta gcctctctag gttatgtggt tgtagtgata gacaacaggg gatcctgtca 2280 
ccgagggctt aaatittgaag gcgcctttaa atataaaatg ggtcaaatag aaattgacga 2340 
tcaggtggaa ggactccaat atctagcttc tcgatatgat ttcattgact tagatcgtgt 2400 
gggcatccac ggctggtcct atggaggata cctctccctg atggcattaa tgcagaggtc 2460 
agatatcttc agggttgcta ttgctggggc cccagtcact ctgtggatct tctatgatac 2520 
aggatacacg gaacgttata tgggtcaccc tgaccagaat gaacagggct attacttagg 2580 
atctgtggcc atgcaagcag aaaagttccc ctctgaacca aatcgtttac tgctcttaca 2640 
tggtttcctg gatgagaatg tccattttgc acataccagt atattactga gttttttagt 2700 
gagggctgga aagccatatg atttacagat ctatcctcag gagagacaca gcataagagt 2760 
tcctgaatcg ggagaacatt atgaactgca tcttttgcac taccttcaag aaaaccttgg 2820 
atcacgtatt gctgctctaa aagtgatata attttgacct gtgtagaact ctctggtata 2880 
cactggctat ttaaccaaat gaggaggttt aatcaacaga aaacacagaa ttgatcatca 2940 
cattttgata cctgccatgt aacatctact cctgaaaata aatgtggtgc catgcagggg 3000 
tctacggttt gtggtagtaa tctaatacct taaccccaca tgctcaaaat caaatgatac 3060 
atattcctga gagacccagc aataccataa gaattactaa aaaaaa 3106 



<210> 31 

<211> 3567 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7477386CB1 

<400> 31 

atggctccac tccgcgcgct gctgtcctac ctgctgcctt tgcactgtgc gctctgcgcc 60 
gccgcgggca gccggacccc agagctgcac ctctctggaa agctcagtga ctatggtgtg 120 
acagtgccct gcagcacaga ctttcgggga cgcttcctct cccacgtggt gtctggccca 180 
gcagcagcct ctgcagggag catggtagtg gacacgccac ccacactacc acgacactcc 240 
agtcacctcc gggtggctcg cagccctctg cacccaggag ggaccctgtg gcctggcagg 300 
gtggggcgcc actccctcta cttcaatgtc actgttttcg ggaaggaact gcacttgcgc 360 
ctgcggccca atcggaggtt ggtagtgcca ggatcctcag tggagtggca ggaggatttt 420 
cgggagctgt tccggcagcc cttacggcag gagtgtgtgt acactggagg tgtcactgga 480 
atgcctgggg cagctgttgc catcagcaac tgtgacggat tggcgggcct catccgcaca 540 
gacagcaccg acttcttcat tgagcctctg gagcggggcc agcaggagaa ggaggccagc 600 
gggaggacac atgtggtgta ccgccgggag gccgtccagc aggagtgggc agaacctgac 660 
ggggacctgc acaatgaagc ctttggcctg ggagaccttc ccaacctgct gggcctggtg 720 
ggggaccagc tgggcgacac agagcggaag cggcggcatg ccciagccagg cagctacagc 780 
atcgaggtgc tgctggtggt ggacgactcg gtggttcgct tccatggcaa ggagcatgtg 840 
cagaactatg tcctcaccct catgaatatc gtggtagatg agatttacca cgatgagtcc 900 
ctgggggttc atataaatat tgccctcgtc cgcttgatca tggttggcta ccgacagcag 960 
tccctgagcc tgatcgagcg cgggaacccc tcacgcagcc tggagcaggt gtgtcgctgg 1020 
gcacactccc agcagcgcca ggaccccagc cacgctgagc accatgacca cgttgtgttc 1080 
ctcacccggc aggactttgg gccctcaggt gggtatgcac ccgtcactgg catgtgtcac 1140 
cccctgagga gctgtgccct caaccatgag gatggcttct cctcagcctt cgtgatagct 1200 
catgagaccg gccacgtgct cggcatggag catgacggtc aggggaatgg ctgtgcagat 1260 
gagaccagcc tgggcagcgt catggcgccc ctggtgcagg ctgccttcca ccgcttccat 1320 
tggtcccgct gcagcaagct ggagctcagc cgctacctcc cgtcctacga ctgcctcctc 1380 
gatgacccct ttgatcctgc ctggccccag cccccagagc tgcctgggat caactactca 1440 
atggatgagc agtgccgctt tgactttggc agtggctacc agacctgctt ggcattcagg 1500 
acctttgagc cctgcaagca gctgtggtgc agccatcctg acaacccgta cttctgcaag 1560 
accaagaagg ggcccccgct ggatgggact gagtgtgcac ccggcaagtg gtgcttcaaa 1620 
ggtcactgca tctggaagtc gccggagcag acatatggcc aggatggagg ctggagctcc 1680 
tggaccaagt ttgggtcatg ttcgcggtca tgtgggggcg gggtgcgatc ccgcagccgg 1740 
agctgcaaca acccctcgcc agcctatgga ggccgcctgt gcttagggcc catgttcgag 1800 
taccaggtct gcaacagcga ggagtgccct gggacctacg aggacttccg ggcccagcag 1860 
tgtgccaagc gcaactccta ctatgtgcac cagaatgcca agcacagctg ggtgccctac 1920 
gagcctgacg atgacgccca gaagtgtgag ctgatctgcc agtcggcgga cacgggggac 1980 
gtggtgttca tgaaccaggt ggttcacgat gggacacgct gcagctaccg ggacccatac 2040 
agcgtctgtg cgcgtggcga gtgtgtgcct gtcggctgtg acciaggaggt ggggtccatg 2100 
aaggcggatg acaagtgtgg agtctgcggg ggtgacaact cccactgcag gactgtgaag 2160 
gggacgctgg gcaaggcctc caagcaggca ggtgctctca agctggtgca gatcccagca 2220 
ggtgccaggc acatccagat tgaggcactg gagaagtccc cccaccgcat tgtggtgaag 2280 
aaccaggtca ccggcagctt catcctcaac CGcaagggca aggaagccac aagccggacc 2340 
ttcaccgcca tgggcctgga gtgggaggat gcggtggagg atgccaagga aagcctcaag 2400 
accagcgggc ccctgcctga agccattgcc atcctggctc tccccccaac tgagggtggc 2460 
ccccgcagca gcctggccta caagtacgtc atccatgagg acctgctgcc ccttatcggg 2520 
agcaacaatg tgctcctgga ggagatggac acctatgagt ggcTcgctcaa gagctgggcc 2580 
ccctgcagca aggcctgtgg aggaggtatc cagttcacca aatacggctg ccggcgcaga 2640 
cgagaccacc acatggtgca gcgacacctg tgtgaccaca agaagaggcc caagcccatc 2700 
cgccggcgct gcaaccagca cccgtgctct cagcctgtgt gggtgacgga ggagtggggt 2760 
gcctgcagcc ggagctgtgg gaagctgggg gtgcagacac gggggataca gtgcctgctg 2820 
cccctctcca atggaaccca caaggtcatg ccggccaaag cctgcgccgg ggaccggcct 2880 
gaggcccgac ggccctgtct ccgagtgccc tgcccagccc agtggaggct gggagcctgg 2940 
tcccagtgct ctgccacctg tggagagggc atccagcagc ggcaggtggt gtgcaggacc 3000 
aacgccaaca gcctcgggca ttgcgagggg gataggccag acactgtcca ggtctgcagc 3060 
ctgcccgcct gtggagcgga gccctgcacg ggagacaggt ctgtcttctg ccagatggaa 3120 
gtgctcgatc gctactgctc cattcccggc taccaccggc tctgctgtgt gtcctgcatc 3180 
aagaaggcct cgggccccaa ccctggccca gaccctggcc caacctcact gccccccttc 3240 
tccactcctg gaagccGctt accaggaccc caggaccctg cagatgctgc agagcctcct 3300 
ggaaagccaa cgggatcaga ggaccatcag catggccgag ccacacagct cccaggagct 3360 
ctggatacaa gctccccagg gacccagcat ccctttgccc ctgagacacc aatccctgga 3420 
gcatcctgga gcatctcccc taccaccccc ggggggctgc cttggggctg gactcagaca 3480 
cctacgccag tccctgagga caaagggcaa cctggagaag acctgagaca tcccggcacc 3540 
agcctccctg ctgcctcccc ggtgaca 3567 

<210> 32 

<211> 2930 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7473089CB1 

<400> 32 

cacgcagacc gcggcagcgg ccgagagccc ggcccagccc cttcccacag cgcggcgttg 60 
cgctgcccgg cgccatgctt ctgctgggca tcctaaccct ggctttcgcc gggcgaaccg 120 
ctggaggctc tgagccagag cgggaggtag tcgttcccat ccgactggac ccggacatta 180 
acggccgccg ctactactgg cggggtcccg aggactccgg ggatcaggga ctcatttttc 240 
agatcacagc atttcaggag gacttttacc tacacctgac gccggatgct cagttcttgg 300 
ctcccgcctt ctccactgag catctgggcg tccccctcca ggggctcacc gggggctctt 360 
cagacctgcg acgctgcttc tattctgggg acgtgaacgc cgagccggac tcgttcgctg 420 
ctgtgagcct gtgcgggggg ctccgcggag cctttggcta ccgaggcgcc gagtatgtca 480 
ttagcccgct gcccaatgct agcgcgccgg cggcgcagcg caacagccag ggcgcacacc 540 
ttctccagcg ccggggtgtt ccgggcgggc cttccggaga ccccacctct cgctgcgggg 600 
tggcctcggg ctggciacccc gccatcctac gggccctgga cccttacaag ccgcggcggg 660 
cgggcttcgg ggagagtcgt agccggcgca ggtctgggcg cgccaagcgt ttcgtgtcta 720 
tcccgcggta cgtggagacg ctggtggtcg cggacgagtc aatggtcaag ttccacggcg 780 
cggacctgga acattatctg ctgacgctgc tggcaacggc ggcgcgactc taccgccatc 840 
ccagcatcct caaccccatc aacatcgttg tggtcaaggt gctgcttctt agagatcgtg 900 
actccgggcc caaggtcacc ggcaatgcgg ccctgacgct gcgcaacttc tgtgcctggc 960 
agaagaagct gaacaaagtg agtgacaagc accccgagta ctgggacact gccatcctct 1020 
tcaccaggca ggacctgtgt ggagccacca cctgtgacac cctgggcatg gctgatgtgg 1080 
gtaccatgtg tgaccccaag agaagctgct ctgtcattga ggacgatggg cttccatcag 1140 
ccttcaccac tgcccacgag ctgggccacg tgttcaacat gccccatgac aatgtgaaag 1200 
tctgtgagga ggtgtttggg aagctccgag ccaaccacat gatgtccccg accctcatcc 1260 
agatcgaccg tgccaacccc tggtcagcct gcagtgctgc catcatcacc gacttcctgg 1320 
acagcgggca cggtgactgc ctcctggacc aacccagcaa gcccatctcc ctgcccgagg 1380 
atctgccggg cgccagctac accctgagcc agcagtgcga gctggctttt ggcgtgggct 1440 
ccaagccctg tccttacatg cagtactgca ccaagctgtg gtgcaccggg aaggccaagg 1500 
gacagatggt gtgccagacc cgccacttcc cctgggccga tggcaccagc tgtggcgagg 1560 
gcaagctctg cctcaaaggg gcctgcgtgg agagacacaa cctcaacaag cacagggtgg 1620 
atggttcctg ggccaaatgg gatccctatg gcccctgctc gcgcacatgt ggtgggggcg 1680 
tgcagctggc caggaggcag tgcaccaacc ccacccctgc caacgggggc aagtactgcg 1740 
agggagtgag ggtgaaatac cgatcctgca atctggagcc ctgccccagc tcagcctccg 1800 
gaaagagctt ccgggaggag cagtgtgagg ctttcaacgg ctacaaccac agcaccaacc 1860 
ggctcactct cgccgtggca tgggtgccca agtactccgg cgtgtctccc cgggacaagt 1920 
gcaagctcat ctgccgagcc aatggcactg gctacttcta tgtgctggca cccaaggtgg 1980 
tggtggacgg cacgctgtgc tctcctgact ccacctccgt ctgtgtccaa ggcaagtgca 2040 
tcaaggctgg ctgtgatggg aacctgggct ccaagaagag attcgacaag tgtggggtgt 2100 
gtgggggaga caataagagc tgcaagaagg tgactggact cttcaccaag cccatgcatg 2160 
gctacaattt cgtggtggcc atccccgcag gcgcctcaag catcgacatc cgccagcgcg 2220 
gttacaaagg gctgatcggg gatgacaact acctggctct gaagaacagc caaggcaagt 2280 
acctgctcaa cgggcatttc gtggtgtcgg cggtggagcg ggacctggtg gtgaagggca 2340 
gtctgctgcg gtacagcggc acgggcacag cggtggagag cctgcaggct tcccggccca 2400 
tcctggagcc gctgaccgtg gaggtcctct ccgtggggaa gatgacaccg ccccgggtcc 2460 
gctactcctt ctatctgccc aaagagcctc gggaggacaa gtcctctcat cccccgcacc 2520 
cccggggagg accctctgtc ttgcacaaca gcgtcctcag cctctccaac caggtggagc 2580 
agccggacga caggccccct gcacgctggg tggctggcag ctgggggccg tgctccgcga 2640 
gctgcggcag tggcctgcag aagcgggcgg tggactggcg gggctccgcc gggcagcgca 2700 
cggtccctgc ctgtgatgca gcccatcggc ccgtggagac acaagcctgc ggggagccct 2760 
gccccacctg ggagctcagc gcctggtcac cctgctccaa gagctgcggc cggggatttc 2820 
agaggcgctc actcaagtgt gtgggccacg gaggccggct gctggcccgg gaccagtgca 2880 
acttgcaccg caagccccag gagctggact tctgcgtcct gaggccgtgc 2930 

<210> 33 
<211> 4230 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7604035CB1 

<400> 33 

agcgaggttg cctggagaga gcgcctgggc gcagaagggt taacgggcca ccgggggctc 60 
gcagagcagg agggtgctct cggacggtgt gtcccccact gcactcctga acttggagga 120 
cagggtcgcc gcgagggacg cagagagcac cctccacgcc cagatgcctg cgtagttttt 180 
gtgaccagtc cgctcctgcc tccccctggg gcagtagagg gggagcgatg gagaactgga 240 
ctggcaggcc ctggctgtat ctgctgctgc ttctgtccct ccctcagctc tgcttggatc 300 
aggaggtgtt gtccggacac tctcttcaga cacctacaga ggagggccag ggccccgaag 360 
gtgtctgggg accttgggtc cagtgggcct cttgctccca gccctgcggg gtgggggtgc 420 
agcgcaggag ccggacatgt cagctcccta cagtgcagct ccacccgagt ctgcccctcc 480 
ctccccggcc cccaagacat ccagaagccc tcctcccccg gggccagggt cccagacccc 540 
agacttctcc agaaaccctc cccttgtaca ggacacagtc tcggggaagg ggtggcccac 600 
ttcgaggtcc cgcttcccac ctagggagag aggagaccca ggagattcga gcggccagga 660 
ggtcccggct tcgagacccc atcaagccag gaatgttcgg ttatgggaga gtgccctttg 720 
cattgccact gcaccggaac cgcaggcacc ctcggagccc acccagatct gagctgtccc 780 
tgatctcttc tagaggggaa gagcctattc cgtcccctac tccaagagca gagccattct 840 
ccgcaaacgg cagcccccaa actgagctcc ctcccacaga actgtctgtc cacaccccat 900 
ccccccaagc agaacctcta agccctgaaa ctgctcagac agaggtggcc cccagaacca 960 
ggcctgcccc cctacggcat caccccagag cccaggcctc tggcacagag cccccctcac 1020 
ccacgcactc cttaggagaa ggtggcttct tccgtgcatc ccctcagcca cgaaggccaa 1080 
gttcccaggg ttgggccagt ccccaggtag cagggagacg ccctgatcct tttccttcgg 1140 
tccctcgggg ccgaggccag cagggccaag ggccttgggg aacggggggg actcctcacg 1200 
ggccccgcct ggagcctgac cctcagcacc cgggcgcctg gctgcccctg ctgagcaacg 1260 
gcccccatgc cagctccctc tggagcctct ttgctcccag tagccctatt ccaagatgtt 1320 
ctggggagag tgaacagcta agagcctgca gccaagcgcc ctgcccccct gagcagccag 1380 
acccccgggc cctgcagtgc gcagccttta actcccagga attcatgggc cagctgtatc 1440 
agtgggagcc cttcactgaa gtccagggct cccagcgctg tgaactgaac tgccggcccc 1500 
gtggcttccg cttctatgtc cgtcacactg aaaaggtcca ggatgggacc ctgtgtcagc 1560 
ctggagcccc tgacatctgt gtggctggac gctgtctgag ccccggctgt gatgggatcc 1620 
ttggctctgg caggcgtcct gatggctgtg gagtctgtgg gggtgatgat tctacctgtc 1680 
gccttgtttc ggggaacctc actgaccgag ggggccccct gggctatcag aagatcttgt 1740 
ggattccagc gggagccttg cggctccaga ttgcccagct ccggcctagc tccaactacc 1800 
tggcacttcg tggccctggg ggccggtcca tcatcaatgg gaactgggct gtggatcccc 1860 
ctgggtccta cagggccggc gggaccgtct ttcgatataa ccgtcctccc agggaggagg 1920 
gcaaagggga gagtctgtcg gctgaaggcc ccaccaccca gcctgtggat gtctatatga 1980 
tctttcagga ggaaaaccca ggcgtttttt atcagtatgt catctcttca cctcctccaa 2040 
tccttgagaa ccccacccca gagccccctg tcccccagct tcagccggag attctgaggg 2100 
tggagccccc acttgctccg gcaccccgcc cagcccggac cccaggcacc ctccagcgtc 2160 
aggtigcggat cccccagatg cccgccccgc cccatcccag gacacccctg gggtctccag 2220 
ctgcgtactg gaaacgagtg ggacactctg catgctcagc gtcctgcggg aaaggtgtct 2280 
ggcgccccat tttcctctgc atctcccgtg agtcgggaga ggaactggat gaacgcagct 2340 
gtgccgcggg tgccaggccc ccagcctccc ctgaaccctg ccacggcacc ccatgccccc 2400 
catactggga ggctggcgag tggacatcct gcagccgctc ctgtggcccc ggcacccagc 2460 
accgccagct gcagtgccgg caggaatttg gggggggtgg ctcctcggtg cccccggagc 2520 
gctgtggaca tctcccccgg cccaacatca cccagtcttg ccagctgcgc ctctgtggcc 2580 
attgggaagt tggctctcct tggagccagt gctccgtgcg gtgcggccgg ggccagagaa 2640 
gccggcaggt tcgctgtgtt gggaacaacg gtgatgaagt gagcgagcag gagtgtgcgt 2700 
caggcccccc acagcccccc agcagagagg cctgtgacat ggggccctgt actactgcct 2760 
ggttccacag cgactggagc tccaagtgct cagccgagtg tgggacggga atccagcggc 2820 
gctctgtggt ctgccttggg agtggggcag ccactcgggc caggccaggg ggaagcagga 2880 
gcaggaactg ggcagagctg tccaacagga agccggcccc ctgacatgcg cgcctgcagc 2940 
ctggggccct gtgagagaac ttggcgctgg tacacagggc cctggggtga gtgctcctcc 3000 
gaatgtggct ctggcacaca gcgtagagac atcatctgtg tatccaaact ggggacggag 3060 
ttcaacgtga cttctccgag caactgttct cacctcccca ggccccctgc cctgcagccc 3120 
tgtcaagggc aggcctgcca ggaccgatgg ttttccacgc cctggagccc atgttctcgc 3180 
tcctgccaag ggggaacgca gacacgggag gtccagtgcc tgagcaccaa ccagaccctc 3240 
agcacccgat gccctcctca actgcggccc tccaggaagc gcccctgtaa cagccaaccc 3300 
tgcagccagc gccctgatga tcaatgcaag gacagctctc cacattgccc cctggtggta 3360 
caggcccggc tctgcgtcta cccctactac acagccacct gttgccgctc ttgcgcacat 3420 
gtcctggagc ggtctcccca ggatccctcc tgaaaggggt ccggggcacc ttcacggttt 3480 
tctgtgccac catcggtcac ccattgatcg gcccactctg aaccccctgg ctctccagcc 3540 
tgtcccagtc tcagcaggga tgtcctccag gtgacagagg gtggcaaggt gactgacaca 3600 
aagtgacttt cagggctgtg gtcaggccca tgtggtggtg tgatgggtgt gtgcacatat 3660 
gcctcaggtg tgcttttggg actgcatgga tatgtgtgtg ctcaaacgtg tatcactttt 3720 
caaaaacragg ttacacagac tgagaaggac aagacctgtt tccttgagac tttcctaggt 3780 
ggaaaggaaa gcaagtctgc agttccttgc taatctgagc tacttagagt gtggtctccc 3840 
caccaactcc agttttgtgc cctaagcctc atttctcatg ttcagacctc acatcttcta 3 900 
agccgccctg tgtctctgac cccttctcat ttgcctagta tctctgcccc tgcctcccta 3 960 
attagctagg gctggggtca gccactgcca atcctgcctt actcaggaag gcaggaggaa 4020 
agagactgcc tctccagagc aaggcccagc tgggcagagg gtgaaaaaga gaaatgtgag 4080 
catccgctcc cccaccaccc cgcccagccc ctagccccac tccctgcctc ctgaaatggt 4140 
tcccacccag aactaattta ttttttatta aagatggtca tgacaaatga aaaaaaaaaa 4200 
aaaaaaataa aaaaacaaaa aaaaaaaata 4230 

<210> 34 

<211> 3699 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 3473847CB1 

<400> 34 

cgcagtgtgc tggcaaagct tgactttccc agcaggccta tgtcataggt actgtggtct 60 
ctacaataca gcagaggtat ctgaggctcc gagaggttga gtgacttgct catggctgca 120 
caaccagtaa atattggagc tggaattcag gtccacggtt tcctggctcc aaagcccatg 180 
attttttccc tcaatttatt ctgactgggg catgggggag ggggtggcct ttgggcaggg 240 
ccaccaggag cgaccaggcc cgtagagagc tgggtgcagg tacagaggaa aacctgttgt 300 
cgagtgtggc ccgtagttcc catttttgcc tgaatggcac atttgaaagt gttatataac 360 
catgtgaata ataatagttg gcctatatga gttttttaat ttgctttttg gtccgcattt 420 
ggtaacttct ttatcatcta ctatactctg ttgtgtctct tttgttgtaa tttgtaagta 480 
ggggtgagat aaagtacacc tagggtttgc tgggtttctt ccatgtcatc atgttcctcc 540 
ttgcatgggg ccaggatccg tggaggttgc ctggcaccta cgtggtggtg ctgaaggagg 600 
agacccacct ctcgcagtca gagcgcactg cccgccgcct gcaggcccag gctgcccgcc 660 . 
ggggatacct caccaagatc ctgcatgtct tccatggcct tcttcctggc ttcctggtga 720 
agatgagtgg cgacctgctg gagctggcct tgaagttgcc ccatgtcgac tacatcgagg 780 
aggactcctc tgtctttgcc cagagcatcc cgtggaacct ggagcggatt acccctccac 840 
ggtaccgggc ggatgaatac cagccccccg acggaggcag cctggtggag gtgtatctcc 900 
tagacaccag catacagagt gaccaccggg aaatcgaggg cagggtcatg gtcaccgact 960 
tcgagaatgt gcccgaggag gacgggaccc gcttccacag acaggccagc aagtgtgaca 1020 
gtcatggcac ccacctggca ggggtggtca gcggccggga tgccggcgtg gccaagggtg 1080 
ccagcatgcg cagcctgcgc gtgctcaact gccaagggaa gggcacggtt agcggcaccc 1140 
tcataggcct ggagtttatt cggaaaagcc agctggtcca gcctgtgggg ccactggtgg 1200 
tgctgctgcc cctggcgggt gggtacagcc gcgtcctcaa cgccgcctgc cagcgcctgg 1260 
cgagggctgg ggtcgtgctg gtcaccgctg ccggcaactt ccgggacgat gcctgcotct 1320 
actccccagc ctcagctccc gaggtcatca cagttggggc caccaatgcc caggaccagc 1380 
cggtgaccct ggggactttg gggaccaact ttggccgctg tgtggacctc tttgccccag 1440 
gggaggacat cattggtgcc tccagcgact gcagcacctg ctttgtgtca cagagtggga 1500 
catcacaggc tgctgcccac gtggctggca ttgcagccat gatgctgtct gccgagccgg 1560 
agctcaccct ggccgagttg aggcagagac tgatccactt ctctgccaaa gatgtcatca 1620 
atgaggcctg gttccctgag gaccagcggg tactgacccc caacctggtg gccgccctgc 1680 
cccccagcac ccatggggca ggttggcagc tgttttgcag gactgtgtgg tcagcacact 1740 
cggggcctac acggatggcc acagccatcg cccgctgcgc cccagatgag gagctgctga 1800 
gctgctccag tttctccagg agtgggaagc ggcggggcga gcgcatggag gcccaagggg 1860 
gcaagctggt ctgccgggcc cacaacgctt ttgggggtga gggtgtctac gccattgcca 1920 
ggtgctgcct gctaccccag gccaactgca gcgtccacac agctccacca gctgaggcca 1980 
gcatggggac ccgtgtccac tgccaccaac agggccacgt cctcacaggc tgcagctccc 2040 
actgggaggt ggaggacctt ggcacccaca agccgcctgt gctgaggcca cgaggtcagc 2100 
ccaaccagtg cgtgggccac agggaggcca gcatccacgc ttcctgctgc catgccccag 2160 
gtctggaatg caaagtcaag gagcatggaa tcccggcccc tcaggagcag gtgaccgtgg 2220 
cctgcgagga gggctggacc ctgactggct gcagtgccct ccctgggacc tcccacgtcc 2280 
tgggggccta cgccgtagac aacacgtgtg tagtcaggag ccgggacgtc agcactacag 2340 
gcagcaccag cgaagaggcc gtgacagccg ttgccatctg ctgccggagc cggcacctgg 2400 
cgcaggcctc ccaggagctc cagtgacagc cccatcccag gatgggtgtc tggggagggt 2460 
caagggctgg ggctgagctt taaaatggtt ccgacttgtc cctctctcag ccctccatgg 2520 
cctggcacga ggggatgggg atgcttccgc ctttccgggg ctgctggcct ggcccttgag 2580 
tggggcagcc tccttgcctg gaactcactc actctgggtg cctcctcccc aggtggaggt 2640 
gccaggaagc tccctccctc actgtggggc atttcaccat tcaaacaggt cgagctgtgc 2700 
tcgggtgctg ccagctgctc ccaatgtgcc gatgtccgtg ggcagaatga cttttattga 2760 
gctcttgttc cgtgccaggc attcaatcct caggtctcca ccaaggaggc aggattcttc 2820 
ccatggatag gggagggggc ggtaggggct gcagggracaa acatcgttgg ggggtgagtg 2880 
tgaaaggtgc tgatggccct catctccagc taactgtgga gaagcccctg ggggctccct 2940 
gattaatgga ggcttagctt tctggatggc atctagccag aggctggaga caggtgtgcc 3000 
cctggtggtc acaggctgtg ccttggtttc ctgagccacc tttactctgc tctatgccag 3060 
gctgtgctag caacacccaa aggtggcctg cggggagcca tcacctagga ctgactcggc 3120 
agtgtgcagt ggtgcatgca ctgtctcagc caacccgctc cactacccgg cagggtacac 3180 
attcgcaccc ctacttcaca gaggaagaaa cctggaacca gagggggcgt gcctgccaag 3240 
ctcacacagc aggaactgag ccagaaacgc agattgggct ggctctgaag ccaagcctct 3300 
tcttacttca cccggctggg ctcctcattt ttacgggtaa cagtgaggct gggaagggga 3360 
acacagacca ggaagctcgg tgagtgatgg cagaacgatg cctgcaggca tggaactttt 3420 
tccgttatca cccaggcctg attcactggc ctggcggaga tgcttctaag gcatggtcgg 3480 
gggagagggc caacaactgt ccctccttga gcaccagccc cacccaagca agcagacatt 3540 
tatcttttgg gtctgtcctc tctgttgcct ttttacagcc aacttttcta gacctgtttt 3600 
gcttttgtaa cttgaagata tttattctgg gttttgtagc atttttatta atatggtgac 3660 
tttttaaaat aaaaacaaac aaacgttgtc ctaaaaaaa 3699 

<210> 35 

<211> 2410 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 3750004CB1 

<400> 35 

cctcagcagt ggccccttcc ctccacgggc tgccccggag ctcagtccca ccccctccgc 60 
cgatgaggcc atgaggcacc gaacggacct gggccagaac ctcctgctct tcctgtgggc 120 
cctgctgaac tgtggtttgg gggtcagtgc tcagggtccg ggcgagtgga ccccgtgggt 180 
gtcctggacc cgctgctcca gctcctgcgg gcgtggcgtc tctgtgcgca gccggcgctg 240 
cctccggctt cctggggaag aaccgtgctg gggagactcc catgagtacc gcctctgcca 300 
gttgccagac tgccccccag gggctgtgcc cttccgagac ctacagtgtg ccctgtacaa 360 
tggccgccct gtcctgggca cccagaagac ctaccagtgg gtgcccttcc atggggcgcc 420 
caaccagtgc gacctcaact gcctggctga ggggcacgcc ttctaccaca gcttcggccg 480 
cgtcctggac ggcaccgcct gcagcccggg tgcccagggg gtctgcgtgg ctggccgctg 540 
ccttagcgcc ggctgtgatg ggttgttggg ctcgggtgcc ctcgaggacc gctgtggccg 600 
ctgcggaggc gccaacgact cgtgcctttt cgtgcagcgc gtgtttcgtg acgccggtgc 660 
cttcgctggg tactggaacg tgaccctgat ccccgagggc gccagacaca tccgcgtgga 720 
acacaggagc cgcaaccacc tgggtatcct aggatcactg atggggggcg atgggcgcta 780 
cgtgcttaat gggcactggg tggtcagccc accagggacc tacgaggcgg ccggcacgca 840 
tgtggtctac acccgagaca cagggcccca ggagacattg caagcagccg ggcccacctc 900 
ccatgacctg ctcctacagg tcctcctgca ggagcccaac cctggcatcg agtttgagtt 960 
ctggctccct cgggagcgct acagcccctt ccaggctcgt gtgcaggccc tgggctggcc 1020 
cctgaggcag cctcagcccc ggggggtgga gcctcagccc cccgcagccc ctgctgtcac 1080 
ccctgcacag accccaacgc tggccccaga cccctgccca ccctgccctg acacccgcgg 1140 
ccgcgcccac cgactactcc actattgcgg cagtgacttt gtgttccagg cccgagtgct 1200 
gggccaccac caccaggccc aggagacccg ctatgaggtg cgcatccagc tcgtctacaa 1260 
gaaccgctcg ccactgcggg cacgcgagta cgtgtgggcg ccaggccact gcccctgccc 1320 
gatgctggca ccccaccggg actacctgat ggctgtccag cgtcttgtca gccccgacgg 1380 
cacacaggac cagctgctgc tgccccacgc cggctacgcc cggccctgga gccctgcgga 1440 
ggacagccgc atacgcctga ctgcccggcg ctgtcctggc tgagcccctg caggagcccc 1500 
ggccacacac agcaagaaag atacatctga ccagcctcaa cgtcaacgta tttcccctct 1560 
caccctggct tccaggcagc tctgaaatac gtcccacctg tgcagctatg tgactccctc 1620 
ccacacacgc ttaagacacc tctgcatgca gtcaaagcca ctgtcacaag ccggcaggca 1680 
ctggtgagga ggcactaagg agactctgac ttttatttcg cctctctcct tggctgccag 1740 
gaagctcata gctatttata ctcagaaagt ttaacgctgc tttctttctc tttgcgcgcg 1800 
tcacacttgc ttggagacac tgtcatgaac gagcatgaca ccctgctgcc ctgggtaccc 1860 
agaagatcat ctgtttactt cccagacact gtgctgtctc tgctctctgc tactcacaca 1920 
caccctcatg tgtgaagggc agagacactg tcacaaacag gcatgcccct tagaagacat 1980 
gcctaaccag gcactgtaac gtaccaacgt accaatttcc ccttttcccc tggctaccag 2040 
gaaactcgga gacaatcttt tcagcctcag catttctggc tggatttcca cccatcaaca 2100 
cgtgcttgct cctccttttt tttttttctg aggtagacct tgctctgtca cctaggctga 2160 
agtgcggtgg tgcaatcatg gctcactgca gcctcaaatt cctgggctca agcgatcctc 2220 
ccactcagct ccacagcagt ggaactcacg tgtgatcaca tgccggctaa tttaaatttt 2280 

gtagagatgg gcttgtacgt gccaaatgtc tcactatggc tcaacatctc tgctgggtcc 2340 

aagactgaat aggatgacat gatggtgtac cccttatcct tatttcagct ttaaaaattc 2400 

tacLaaaaaaa 2410 

<210> 36 
<211> 549 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 4904126CB1 

<400> 36 

gggaggagag aaaagccatg gccgacaagg tcctgaagga gaagagaaag cagtttatcc 60 
gttcagtggg cgaaggtaca ataaatggct tactgggtga attattggag acaagggtgc 120 
tgagccagga agagatagag atagtaaaat gtgaaaatgc tacagttatg gataaggccc 180 
gagctttgct tgactctgtt attcggaaag gggctccagc atgccaaatt tgcatcacat 240 
acatttgtga agaagacagt cacctggcag ggacgctggg actctcagca ggtccaacat 300 
ctggaaatca ccttactaca caagattctc aaatagtact tccttcctag gtaatgctgt 360 
ttttaaagaa agagcattct ttgaaccgtg gcttcccgtg acattaatgt tgtaggatga 420 
accacagtta aaggggctat gaagaattcc catagagtga tcatacaatt ttctttttgt 480 
aatctattct gcttttgtag caactgtcaa aacagcttca ctatctatgt ctacattaaa 540 
atttggaat 549 

<210> 37 
<211> 2755 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 71268415CB1 



tgaacatctc aagccgcccc cgggaaactg tgggttcgag cactccaagc ccaccaccag 360 
ggactgggct cttcagttta cacaacagac caagaagcga cctcgcagga tgaaaaggga 420 
agatttaciac tccatgaagt atgtggagct ttacctcgtg gctgattatt tagagtttca 480 
gaagaatcga cgagaccagg acgccaccaa acacaagctc atagagatcg ccaactatgt 540 
tgataagttt taccgatcct tgaacatccg gattgctctc gtgggcttgg aagtgtggac 600 
ccacgggaac atgtgtgaag tttcagagaa tccatattct accctctggt cctttctcag 660 
ttggaggcgc aagctgcttg cccagaagta ccatgacaac gcccaattaa tcacgggcat 720 
gtccttccac ggcaccacca tcggcctggc ccccctcatg gccatgtgct ctgtgtacca 780 
gtctggagga gtcaacatgg accactccga gaatgccatt ggcgtggctg ccaccatggc 840 
ccacgagatg ggccacaact ttggcatgac ccatgattct gcagattgct gctcggccag 900 
tgcggctgat ggtgggtgca tcatggcagc tgccactggg cacccctttc ccaaagtgtt 960 
caatggatgc aacaggaggg agctggacag gtatctgcag tcaggtggtg gaatgtgtct 1020 
ctccaacatg ccagacacca ggatgttgta tggaggccgg aggtgtggga acgggtatct 1080 
ggaagatggg gaagagtgtg actgtggaga agaagaggaa tgtaacaacc cctgctgceia 1140 
tgcctctaat tgtaccctga ggccgggggc ggagtgtgct cacggctcct gctgccacca 1200 
gtgtaagctg ttggctcctg ggaccctgtg ccgcgagcag gccaggcagt gtgacctccc 1260 
ggagttctgt acgggcaagt ctccccactg ccctaccaac ttctaccaga tggatggtac 1320 
cccctgtgag ggcggccagg cctactgcta caacggcatg tgcctcacct accaggagca 1380 
gtgccagcag ctgtggggac ccggagcccg acctgcccct gacctctgct tcgagaaggt 1440 
gaatgtggca ggagacacct ttggaaactg tggaaaggac atgaatggtg aacacaggaa 1500 
gtgcaacatg agagatgcga agtgtgggaa gatccagtgt cagagctctg aggcccggcc 1560 
cctggagtcc aacgcggtgc ccattgacac cactatcatc atgaatggga ggcagatcca 1620 
gtgccggggc acccacgtct accgaggtcc tgaggaggag ggtgacatgc tggacccagg 1680 
gctggtgatg actggaacca agtgtggcta caaccatatt tgctttgagg ggcagtgcag 1740 
gaacacctcc ttctttgaaa ctgaaggctg tgggaagaag tgcaatggcc atggggtctg 1800 
taacaacaac cagaactgcc actgcctgcc gggctgggcc ccgcccttct gcaacacacc 1860 
gggccacggg ggcagtatcg acagtgggcc tatgccccct gagagtgtgg gtcctgtggt 1920 

agctggagtg ttggtggcca tcttggtgct ggcggtcctc atgctgatgt actactgctg 1980 

cagacagaac aacaaactag gccaactcaa gccctcagct ctcccttcca agctgaggca 2040 

acagttcagt tgtcccttca gggtttctca gaacagcggg actggtcatg ccaacccaac 2100 

tttcaagctg cagacgcccc agggcaagcg aaaggtgatc aacactccgg aaatcctgcg 2160 

gciagccctcc cagcctcctc cccggccccc tccagattat ctgcgtggtg ggtccccacc 2220 

tgcaccactg ccagctcacc tgagcagggc tgctaggaac tccccagggc ccgggtctca 2280 

aatagagagg acggagtcgt ccaggaggcc ticctccaagc cggccaattc cccccgcacc 2340 

aaattgcatc gtttcccagg acttctccag gcctcggccg ccccagaagg cactcccggc 2400 

aaacccagtg ccaggccgca ggagcctccc caggccagga ggtgcatccc cactgcggcc 2460 

ccctggtgct ggccctcagc agtcccggcc tctggcagca cttgcccccia aggtgagtcc 2520 

acgggaagcc ctcaaggtga aagctggtac cagagggctc caggggggca ggtgtagagt 2580 

tgagaaaaca aagcaattca tgcttcttgt ggtctggact gaacttccag aacaaaagcc 2640 

aagggcaaaa cattcatgtt tcttggtgcc cgcttgactg tggagttttg gcttcatgtg 2700 

aaaggtgatt cttagaatcc tgagctgtgg tggcttcagt cctgcccctg cacct 2755 

<210> 38 
<211> 2553 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 7473301CB1 

<400> 38 

atggacaaag aaaacagcga tgtttcagcc gcacctgctg acctgaaaat atccaatatc 60 

tcagtccaag tggtcagtgc ccaaaagaag ctgccagtga gacgaccacc gttgccaggg 120 

agacgactac cattgccagg aagacgacca ccacaaagac ccattggcaa agccaaaccc 180 

aagaagcaat ccaagaaaaa agttcccttit tggaatgtac aaaataaaat cattctcttc 240 

acagtatttt tattcatcct agcagtcata gcctggacac ttctgtggct gtatatcagt 300 

aagacagaaa gcaaagatgc tttttacttt gctgggatgt ttcgcatcac caacatcgag 360 

tttcttcccg aataccgaca aaaggagtcc agggaatttc tttcagtgtc acggactgtg 420 

cagcaagtga taaacctggt ttatacaaca tctgccttct ccaaatttta tgagcagtct 480 

gttgttgcag atgtcagcag caacaacaaa ggcggcctcc ttgtccactt ttggattgtt 540 

tttgtcatgc cacgtgccaa aggccacatc ttctgtgaag actgtgttgc cgccatcttg 600 

aaggactcca tccagacaag catcataaac cggacctctg tggggagctt gcagggactg 660 

gctgtggaca tggactctgt ggtactaaat ggtgattgtt ggtcattcct aaaaaaEiaag 720 

aaaagaaagg aaaatggtgc tgtctccaca gacaaaggct gctctcagta cttctatgca 780 

gagcatctgt ctctccacta cccgctggag atttctgcag cctcagggag gctgatgtgt 840 

cacttcEiagc tggtggccat agtgggctac ctgattcgtc tctcaatcaa gtccatccaa 900 

atcgaagccg acaactgtgt cactgactcc ctgaccattt acgactccct tttgcccatc 960 

cggagcagca tcttgtacag aatttgtgaa cccacaagaa cattsiatgtc atttgtttct 1020 

acaaataatc tcatgttggt gacatttaag tctcctcata tacggaggct ctcaggaatc 1080 

cgggcatatt ttgaggtcat tccagaacaa aagtgtgaaa acacagtgtt ggtcaaagac 1140 

atcactggct ttgaagggaa aatttCEiagc ccatattacc cgagctacta tcctccaaaa 1200 

tgcaagtgta cctggaaatt tcagacttct ctatcaactc ttggcatagc actgaaattc 1260 

tataactatt caataaccaa gaagagtatg aaaggctgtg agcatggatg gtgggaaatt 1320 

tatgagcaca tgtactgtgg ctcctacatg gatcatcaga caatttttcg agtgcccagc 1380 

cctctggttc acattcagct ccagtgcagt tcaaggcttt caggcaagcc acttttggca 1440 

gaatatggca gttacaacat cagtcaaccc tgccctgtgg gatcttttag atgctcctcc 1500 

ggtttatgtg tccctcaggc ccagcgtggt gatggagtaa atgactgctt tgatgaaagt 1560 

gatgaactgt tttgcgtgag ccctcaacct gcctgcaata ccagctcctt caggcagcat 1620 

ggccctctca tctgtgatgg cttcagggac tgtgagaatg gccgggatga gcaaaactgc 1680 

actcaaagta ttccatgcaa caacagaact tttaagtgtg gcaatgatat ttgctttagg 1740 

aaacaaaatg caaaatgtga tgggacagtg gattgtccag atggaagtga tgaagaaggc 1800 

tgcacctgca gcaggagttc ctccgccctt caccgcatca tcggaggcac agacaccctg 1860 

gaggggggtt ggccgtggca ggtcagcctc cactttgttg gatctgccta ctgtggtgcc 1920 

tcagtcatct ccagggagtg gcttctttct gcagcccact gttttcatgg aaacaggctg 1980 

tcagatccca caccatggac tgcacacctc gggatgtatg ttcaggggaa tgccaagttt 2040 

gtctccccgg tgagaagaat tgtggtccac gagtactata acagtcagac ttttgattat 2100 

gatattgctt tgctacagct cagtattgcc tggcctgaga ccctgasiaca gctcattcag 2160 

ccaatatgca ttcctcccac tggtcagaga gttcgcagtg gggagaagtg ctgggtaact 2220 

ggctgggggc gaagacacga agcagataat aaaggctccc tcgttctgca gcaagcggag 2280 

gtagagctca ttgatcaaac gctctgtgtt tccacctacg ggatcatcac ttctcggatg 2340 

ctctgtgcag gcataatgtc aggcaagaga gatgcctgca aaggagattc gggtggacct 2400 

ttatcttgtc gaagaaaaag tgatggaaaa tggattttga ctggcattgt tagctgggga 2460 
catggatgtg gacgaccaaa ctttcctggt gtttacacaa gggtgtcaaa ctttgttccc 2520 
tggattcata aatatgtccc ttctcttttg taa 2553 

<210> 39 

<211> 1041 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7473308CB1 



<400> 39 

atgttcagcg gcaacacagg aaaaacccat attatcaatg ctcaaaaacc tggccacctc 60 
aggcttagcc agttattcgt gagcagagag gtgtgtcatc tacatggcag tcatggcctg 120 
gatgggtctg gaactgtggc aagaatcctt ccaggaaaca gccggtctcc ctctctgctc 180 
tcagaaggca agtttcctta tcacctgtct gctctcagaa ggcaagtttc cttatcacct 240 
gtgaatcaca aacccacaga gtggccaaac atactgatgc aagaccatag gaaggggaaa 300 
gctgcagttg gtgtctcctt tgatgatgat gacaagattg ttgggggcta caactgtgag 360 
gagaattctg tcccctacca ggtgtccctg aattctggct accacttctg tgttggctcc 420 
ctcaacaggg aatactgcat ccaggtgaga ctgggagagc acaacatcga agtcctagag 480 
gggaatgaac agttcatcta tgcggtcaag atcatccgcc accccaaata caacagctgg 540 
actctggaca atgacatcct gctgatcaag ctctccacac ctgccatcat caatgcccat 600 
gtgtccacca tctctctgcc caccacccct ccagctgctg gcactgagtg cctcatctct 660 
ggctggggca acactctgag ttctggcgcc gactacccag acgagctgca gtgcctggat 720 
gctcctgtgc tgagccaggc tgagtatgaa gcctcctacc ctggaaagat taccaacaac 780 
gtgttttgtg tgggtttcct tgagggaggc aaggattcct gccagattat tcctatcaaa 840 
gtgcagcagc tggttacctc aagccaagag acagacataa ggatccctat ggccttgcag 900 
acagctgctt ccacctccta cctgggcccc ttagactctt tacacaggaa agtgagtcac 960 
cccactgaga agcgttgcca gcagaaacag ggcatgaaaa tcacagataa ccatgggatt 1020 
acttccaagt ggtcagtata a 1041 

<210> 40 

<211> 1707 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7478021CB1 

<400> 40 

atgctcgccg cctccatctt ccgtccgaca ctgctgctct gctggctggc tgctccctgg 60 
cccacccagc ccgagagtct cttccacagc cgggaccgct cggacctgga gccgtcccca 120 
ctgcgccagg ccaagcccat tgccgacctc cacgctgctc agcggttcct gtccagatac 180 
ggctggtcag gggtgtgggc ggcctggggg cccagtcccg aggggccgcc ggagaccccc 240 
aagggcgccg ccctggccga ggcggtgcgc aggttccagc gggcgaacgc gctgccggcc 300 
agcggggagc tggacgcggc caccctagcg gccatgaacc ggccgcgctg cggggtcccg 360 
gacatgcgcc caccgccccc ctccgccccg ccttcgcccc cgggcccgcc ccccagagcc 420 
cgctccaggc gctccccgcg ggcgccgctg tccttgtccc ggcggggttg gcagccccgg 480 
ggctaccccg acggcggagc tgcccaggcc ttctccaaga ggacgctgag ctggcggctg 540 
ctgggcgagg ccctgagcag ccaactgtcc gtggccgacc agcggcgcat agaggcgctg 600 
gccttcagga tgtggagcga ggtgacgccg ctggacttcc gcgaggacct ggccgccccc 660 
ggggccgcgg tcgacatcaa gctgggcttt gggagacggc acctgggctg tccgcgggcc 720 
ttcgatggga gcgggcagga gtttgcacac gcctggcgcc taggtgacat tcactttgac 780 
gacgacgagc acttcacacc tcccaccagt gacacgggca tcagccttct caaggtggcc 840 
gtccatgaaa ttggccatgt cctgggcttg cctcacacct acaggacggg atccataatg 900 
caaccaaatt acattcccca ggagcctgcc tttgagttgg actggtcaga caggaaagca 960 
attcaaaagc tgtatggttc ctgtgaggga tcatttgata ctgcgtttga ctggattcgc 1020 
aaagagagaa accaatatgg agaggtgatg gtgagattta gcacatattt cttccgtaac 1080 
agctggtact ggctttatga aaatcgaaac aataggacac gctatgggga ccctatccaa 1140 
atcctcactg gctggcctgg aatcccaaca cacaacatag atgcctttgt tcacatctgg 1200 
acatggaaaa gagatgaacg ttattttttt caaggaaatc aatactggag atatgacagt 1260 
gacaaggatc aggccctcac agaagatgaa caaggaaaaa gctatcccaa attgatttca 1320 
gaaggatttc ctggcatccc aagtccccta gacacggcgt tttatgaccg aagacagaag 1380 
ttaatttact tcttcaagga gtcccttgta tttgcatttg atgtcaacag aaatcgagta 1440 
cttaattctt atccaaagag gattactgaa gtttttccag cagtaatacc acaaaatcat 1500 
cctttcagaa atatagattc cgcttattac tcctatgcat acaactccat tttctttttc 1560 
aaaggcaatg catactggaa ggtagttaat gacaaggaca aacaacagaa ttcctggctt 1620 
cctgctaatg gcttatttcc aaaaaagttt atttcagaga agtggtttga tgtttgtgac 1680 
gtccatatct ccacactgaa catgtaa 1707 

<210> 41 

<211> 1262 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 4333459CB1 

<400> 41 

aaaagatctt tgcgaaacac tacattcaga aacatcagat ggacatgctt gattcaccac 60 
gtcttggtta atgaataaac ttgttttaaa ttggcttatt gctggtctct caaggcttcc 120 
tatttttgtt tgctttagtc tctctaaaat ttcagggaaa aactatgagt ctcaaaatgc 180 
ttataagcag gaacaagctg attttactac taggaatagt cttttttgsia cgaggtaaat 240 
ctgcaactct ttcgctcccc aaagctccca gttgtgggca gagtctggtt aaggtacagc 300 
cttggaatta tttteiacatt ttcagtcgca ttcttggagg aagccaagtg gagaagggtt 360 
cctatccctg gcaggtatct ctgaaacaaa ggcagaagca tatttgtgga ggaagcatcg 420 
tctcaccaca gtgggtgatc acggcggctc actgcattgc aaacagaaac attgtgtcta 480 
ctttgaatgt tactgctgga gagtatgact taagccagac agacccagga gagcaaactc 540 
tcactattga aactgtcatc atacatccac atttctccac caagaaacca atggactatg 600 
atattgccct tttgaagatg gctggagcct tccaatttgg ccactttgtg gggcccatat 660 
gtcttccaga gctgcgggag caatttgagg ctggttttat ttgtacaact gcaggctggg 720 
gccgcttaac tgaaggtggc gtcctctcac aagtcttgca ggaagtgaat ctgcctattt 780 
tgacctggga agagtgtgtg gcagctctgt taacactaaa gaggcccatc agtigggaaga 840 
cctttctttg cacaggtttt cctgatggag ggagagacgc atgtcaggga gattcaggag 900 
gttcactcat gtgccggaat aagaaagggg cctggactct ggctggtgtg acttcctggg 960 
gtttgggctg tggtcgaggc tggagaaaca atgtgaggaa aagtgatcaa ggatcccctg 1020 
ggatcttcac agacattagt aaagtgcttt cctggatcca cgaacacatc caaactggta 1080 
actaagccat cacacaaggt taagaagctg ccattctgct agggccagag acagcatcag 1140 
cagagtcctg gcaaatcaga gcacctgaac caacaggctc tacctctgtt ctcagtgtag 1200 
cacacaagga ttgtgaggtt taccaagtct aaatiaaaaca agagttaaat atggtiaaaaa 1260 
aa 1262 

<210> 42 

<211> 3067 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 6817347CB1 

<400> 42 

gcactgtgaa cgttggttgc atccaaatct gaattttgtc tgggaccagg gtcagggacc 60 
agaatacacc agagctgagg gccagcccta cctgagaacc atcaacaaac ttaccccaca 120 
tcccattata cctcctcact ccctgcagcc tgtcagcttc cccaatctcc cacactcact 180 
gtcacctggg gctctggtgc accagatgac actacttgct ccctggtaca caggccccat 240 
gatccccatg gatgttaatg agcccagctc cgtgaccacg gctcctaccc tcagctctag 300 
cctgcagcat atctcctcat tcctggccac tggtaagaaa ctttccctcc attttggtca 360 
tccacgtgag tgtgaagtca ccaggattga tgacaaaaat agciagaggat tggaagacag 420 
tgagccaggt gccaaactct tcaataatga tggagtctgt tgttgcctgc aaaaacgggg 480 
gccagtgaac atfcacatcag tgtgtgtgag tcccaggacc ttacaaatat cagtttttgt 540 
gttatcagag aaatacgagg gtattgttaa atttgaatcg gatgaattac cttttggtgt 600 
aattggttct aatattggtg atgcacattt tcaagaattc agggctggaa tctcctggaa 660 
gcctgtggta gatcctgatg accccattcc tcagttccct gattgctgca gcagcagcag 720 
cagcaggatt ccttcagtga gtgtgctagt tgcagttcct ctggttgcag gccacaaagg 780 
gcaggcattt attgaaagga tgctggggtg cttcaaggaa ttgaagcaag agctgactca 840 
ggaagggccg ggcgggggac accccaggtc tgcgtggccc ccgcgccgcc acgcccagtg 900 
gccgcccgag ccctgcgagc agggggagga gccgccgcca gtggaggcgg aggaggtaga 960 
ggaggcggag acggcggaga aggcggagag gaaggtggag gcggaggcga aggtggaggg 1020 
gaaggcggag gcggcgggga aggcggaggc ggcggggaag gtggacgcca ccgagaaggt 1080 
ggagacggcg gggaaggtgg acgccgctgg gaaggtggag acggcggagg gtccgggccg 1140 
ccgggctgag ctcaagctgg agcccgaacc cgagccggtc cgggaggcgg agcaggagcc 1200 
gaagcaggag ctggaggatg agaacccagc gcggagcggc ggtggcggca acagcgacga 1260 
ggttcctccc cccacccttc cctccgatcc accgcggccc cccgatccct ctccgcgtcg 1320 
cagtcgtgcg ccgcgccgcc gaccccggcc ccggccccag acccggctcc gtaccccgcc 1380 
gcagcctagg ccccggcccc cgccccggcc ccggccccgg cgcggccctg ggggcggatg 1440 
cctggatgtg gattttgccg tggggccacc aggctgttct cacgtgaaca gctttaaggt 1500 
gggagagaac tggaggcagg aactgcgggt tatctaccag tgcttcgtgt ggtgtggaac 1560 
cccagagacc aggaaaagca aggcaaagtc ctgcatctgc catgtgtgtg gcacccatct 1620 
gaacagactc cactcttgcc tttcctgtgt cttctttggc tgcttcacgg agaaacacat 1680 
tcacgagcac gcagagacga aacaacacaa cttagcagta gacctgtatt acggaggtat 1740 
atactgcttt atgtgtaagg actatgtata tgacaaagac attgagcaaa ttgccaaaga 1800 
agagcaagga gaagctttga aattacaagc ctccacctca acagaggttt ctcaccagca 1860 
gtgttcagtg ccaggccttg gtgagaaatt cccaacctgg gaaacaacca aaccagaatt 1920 
agaactgctg gggcacaacc *egaggagaag aagaatcacc tccagcttta cgatcggttt 1980 
aagaggactc atcaatcttg gcaacacgtg ctttatgaac tgcattgtcc aggccctcac 2040 
ccacacgccg atactgagag atttctttct ctctgacagg caccgatgtg agatgccgag 2100 
tcccgagttg tgtctggtct gtgagatgtc gtcgctgttt cgggagttgt attctggaaa 2160 
cccgtctcct catgtgccct ataagttact gcacctggtg tggatacatg cccgccattt 2220 
agcagggtac aggcaacagg atgcccacga gttcctcatt gcagcgttag atgtcctgca 2280 
caggcactgc aaaggtgatg atgtcgggaa ggcggccaac aatcccaacc actgtaactg 2340 
catcatagac caaatcttca caggtggcct gcagtctgat gtcacctgtc aagcctgcca 2400 
tggcgtctcc accacgatag acccatgctg ggacattagt ttggacttgc ctggctcttg 2460 
cacctccttc tggcccatga gcccagggag ggagagcagt gtgaacgggg aaagccacat 2520 
accaggaatc accaccctca cggactgctt gcggaggttt acgaggccag agcacttagg 2580 
aagcagtgcc aaaatcaaat gtggtagttg ccaaagctac caggaatcta ccaaacagct 2640 
cacaatgaat aaattacctg tcgttgcctg ttttcatttc aaacggtttg aacattcagc 2700 
gaaacagagg cgcaagatca ctacatacat ttcctttcct ctggagctgg atatgacgcc 2760 
gtttatggcc tcaagtaaag agagcagaat gaatggacaa ttgcagctgc caaccaatag 2820 
tggaaacaac gaaaataagt attccttgtt tgctgtggtt aatcaccaag gaaccttgga 2880 
gagtggccac tataccagct tcatccggca ccacaaggac cagtggttca agtgtgatga 2940 
tgccgtcatc actaaggcca gtattaagga cgtactggac agtgsiagggt atttactgtt 3000 
ctatcacaaa caggtgctag aacatgagtc agaaaaagtg aaagaaatiga acacacaagc 3060 
ctactga 3067 